Abstract


In the last few years, significant advances have been made in under- 
standing the distributions of exoplanet populations and the architecture 
of planetary systems. We review the recent progress of planet statis- 
tics, with a focus on the inner < 1 AU region of the planetary system 
that has been fairly thoroughly surveyed by the Kepler mission. We 
also discuss the theoretical implications of these statistical results for 
planet formation and dynamical evolution. 
1. INTRODUCTION


“Who ordered that?” said the theorist I. Rabi when learning about the unexpected discovery 
of muons in 1936. Little did particle physicists know that it would only be the beginning of 
uncovering a puzzling “particle zoo” filled with diverse particles in the next three decades, 
until revolutionary theoretical insights were developed to classify the elementary particles. 
Now, nearly three decades since the astonishing discovery of a hot Jupiter (Mayor & Queloz
1995), the “exoplanet zoo” is ever growing: whenever the detection territories grow in
breadth or depth, nature appears to be teeming with new species. Theorists working on
planet formation and evolution face distinctly different sets of challenges from particle
physicists: in the popular paradigm, forming planets from dust grains is a daunting march
that spans tens of orders of magnitude in mass and involves many physical processes that
are too complex for first-principles calculations. In hindsight, it should probably be of little
surprise that a theory involving such complicated physics, which was anchored by the sole 
sample of our solar system, would have limited predictive success. 

We review the recent progress of planet statistics and identify patterns emerging from
the thousands of known exoplanets that cover a broad region of the parameter space (see
Figure 1). Robustly identifying patterns in the intrinsic distributions of planets can
stimulate and test theories. Conversely, theoretical advances may also beam the searchlight
on fresh observational ground, as exemplified by the development of the photoevaporation
theory leading to the recent discovery of a “radius valley” (see Section 2.1.4). Since the
last Annual Reviews article on exoplanet populations, the field of
planet statistics has made significant progress. In particular, the large and homogeneous
planet sample from the NASA Kepler mission has provided the best
source for statistical studies, but a major shortcoming of the Kepler data was the initial lack 
of accurate stellar parameters for both the planet hosts and the target stars (i.e., the parent 
sample). In the last few years, substantial efforts have been dedicated to systematically
characterizing the Kepler sample and thus unleashing its potential for statistical studies.
These include asteroseismology (e.g., Van Eylen & Albrecht 2015), the Gaia data releases
(Gaia Collaboration et al. 2016, 2018), follow-up spectroscopic programs such as the
LAMOST-Kepler survey and the California-Kepler Survey (CKS; Johnson et al. 2017), as well
as many projects of the Kepler Follow-up Observation Program (KFOP; et al. 2017). Moreover,
substantial work on understanding the Kepler pipeline detection efficiency and on vetting
false positives has much improved the reliability of Kepler statistical inference (e.g.,
2015; Morton et al. 2016). Last but not least, in-depth developments have recently been
made to disentangle the intricate observational biases of multi-planet systems. These
efforts have made it possible to offer new insights into planet distributions and
architectures.

Figure 1

Mass versus semi-major axis of known planets, based on the “Confirmed Planets” list from the NASA Exoplanet Archive
(Akeson et al. 2013; acquired in September 2020) and the reliable Kepler planet candidates (see Section 2 for more
details). Different colors distinguish the planet detections, as well as the approximate sensitivity curves, of
ground-based transit (purple), the Kepler survey (blue), RV surveys (orange), microlensing (green), and direct imaging
(brown). The masses of the Kepler detections are estimated from the measured radii according to the Chen & Kipping
mass-radius relation. The sensitivity curve of Kepler is converted in a similar way from that measured in the
radius-period plane (see Figure 2). The sensitivity curve for the 10-yr Gaia astrometry survey is shown in red, for
which we have assumed a Sun-like host at 20 pc and required a 3-σ detection over the expected precision. For space-based
microlensing, we adopt the sensitivity curve of the microlensing survey that will be performed by the Nancy Grace Roman
Space Telescope (formerly known as WFIRST; 2019). Images of the solar system planets from NASA are
shown at their corresponding locations.

Frequency of planets: the average number of planets per star $\bar{n}_p$ (Equation 1).
Frequency of planetary systems: the fraction of stars with planets $F_p$ (Equation 2).
Average planet multiplicity: the average number of planets per planetary system $\bar{m}_p$ (Equation 3).

In this review, we first clarify in Sections 1.1 and 1.2 several common confusions in
exoplanet statistical studies. Then we discuss planet distributions in the inner (< 1 AU)
and the outer (~ 1-10 AU) regions in Sections 2 and 3, respectively. The former is focused on
results from the Kepler mission, and the latter includes updated results from radial velocity
(RV) and gravitational microlensing. A brief discussion of the free-floating planets (FFPs)
from microlensing is also provided. We focus on planets around $\gtrsim$ Gyr-old stars, while
planets orbiting young stars found by direct imaging are not discussed (see the review by
Bowler 2016). The implications for theories of planet formation and evolution are discussed
in Section 4. Finally, in Section 5, we summarize and outline the promising directions for
future developments.


1.1. On defining and interpreting planet "occurrence rate" 


Many statistical studies focus on deriving the intrinsic “occurrence rate” (or the often
interchangeably used term “frequency”) of planets. But from one study to another, the same
term can carry different meanings. In the following we clarify these different definitions to
avoid further misinterpretations.

In most studies, the derived occurrence rate is the average number of planets per star,
and we denote it as $\bar{n}_p$, which is defined as

$$\bar{n}_p \equiv \frac{\text{Total \# of planets}}{\text{Total \# of stars}}. \tag{1}$$


Here a planet is restricted to lie within a predefined parameter space, often in the period-radius
plane (for the transit method) or the period-mass (or minimum mass $m_p \sin i$) plane
(for the RV method). Similarly, a star is restricted to a star-like target of predefined
properties. Since a large fraction of such stars may actually have unresolved stellar companions,
the correction for the impact of the stellar binarity can be important for the inference of
the planet formation efficiency (see Section 2.5.1).

Another important quantity sometimes referred to as occurrence rate is the fraction of
stars with planets $F_p$,

$$F_p \equiv \frac{\text{Total \# of planetary systems}}{\text{Total \# of stars}}. \tag{2}$$


Here a planetary system has at least one planet existing in a predefined parameter space.
By definition $F_p \le 1$, so it is usually reported as a percentage. However, an occurrence rate
reported as a percentage (i.e., “X% of stars have planets”) does not necessarily mean that
it is the fraction of stars that are hosts of planets, since $\bar{n}_p$ is also frequently reported as a
percentage.

To distinguish between the two definitions, we refer to $\bar{n}_p$ as the frequency of planets
and $F_p$ as the frequency of planetary systems. The ratio of the two measures the average
number of planets per planetary system (within a predefined parameter space), which we
call average planet multiplicity and denote $\bar{m}_p$,

$$\bar{m}_p \equiv \frac{\bar{n}_p}{F_p} = \frac{\text{Total \# of planets}}{\text{Total \# of planetary systems}}. \tag{3}$$
Kepler data suggest that multi-planet systems are common, so usually $\bar{m}_p$ is larger than
unity, and consequently $\bar{n}_p$ and $F_p$ substantially differ from each other. They only become
similar when the average planet multiplicity $\bar{m}_p \rightarrow 1$, which can happen when either a)
a category of planets with low intrinsic multiplicity (e.g., short-period giant planets) is
concerned, or b) the parameter space of interest is small enough that systems with more
than one such planet are rare.

The three quantities, $\bar{n}_p$, $F_p$, and $\bar{m}_p$, are all important for testing theories. To provide
a simple example, with only $\bar{n}_p$ measured to be unity, it is possible that all stars have one
planet ($F_p = 100\%$ and $\bar{m}_p = 1$) or that half of the stars have two planets ($F_p = 50\%$ and
$\bar{m}_p = 2$). These two cases obviously demand different theoretical explanations.
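
The two cases above can be made concrete with a minimal Python sketch. The function name `occurrence_summary` and the two four-star toy samples are hypothetical and introduced only for illustration; each list entry is the number of planets, within the chosen parameter space, around one star.

```python
def occurrence_summary(planets_per_star):
    """Return (n_p, F_p, m_p) of Equations 1-3 for a list of per-star planet counts."""
    n_stars = len(planets_per_star)
    n_planets = sum(planets_per_star)
    # A "planetary system" is a star with at least one planet in the parameter space.
    n_systems = sum(1 for k in planets_per_star if k > 0)
    n_p = n_planets / n_stars
    f_p = n_systems / n_stars
    m_p = n_planets / n_systems if n_systems else float("nan")
    return n_p, f_p, m_p


case_a = [1, 1, 1, 1]  # every star hosts one planet
case_b = [2, 2, 0, 0]  # half of the stars host two planets

print(occurrence_summary(case_a))  # (1.0, 1.0, 1.0)
print(occurrence_summary(case_b))  # (1.0, 0.5, 2.0)
```

Both toy samples share the same frequency of planets, so a measurement of that quantity alone cannot tell them apart.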

Observationally, the derivations of $\bar{n}_p$ and $F_p$ have rather different requirements and
follow different procedures. It is generally more straightforward to derive $\bar{n}_p$, since
correcting the detectability of individual planets concerns observables directly measurable from
surveys (e.g., planet size and orbital period for transit, assuming that the properties of the
stars are known). In contrast, the detectability of a planetary system usually concerns the
intrinsic architecture of the system, including the planet multiplicity and distributions of
the orbital and physical parameters, many of which may not be directly observable, so the
derivation of $F_p$ can rely on assumptions of these unknowns. This is especially an issue in
transit surveys: the derivation of $F_p$ requires assumptions about the mutual inclinations
between planets, and different assumptions can lead to fairly different values of $F_p$ (see
Section 2.2).

In deriving the two frequencies, statistical studies involving multi-planet systems usually
treat the planet occurrence as a Poisson process. This may be a reasonable assumption in the
derivation of the planet frequency $\bar{n}_p$, but it can lead to unreliable results in the derivation
of the planetary system frequency $F_p$. This Poisson process assumption implies that the
occurrences of individual planets in the same system are independent and that their physical
and orbital properties are independent of the properties of other planets or of the host star.
As discussed later in this review, such an assumption breaks down in certain circumstances.
Below we provide a specific example to demonstrate its impact on the planetary system
frequency. The fractions of Sun-like stars with cold giant planets and with planets that
Kepler is sensitive to are 10% and 30%, respectively. The fraction of such stars with at
least one planet in the joint parameter space would be $1 - (1 - 30\%) \times (1 - 10\%) = 37\%$ under
the Poisson process assumption. However, this frequency is determined to be $\sim 30\%$ as a
result of the strong correlation between the inner and the outer planets (Section 3.2). The
correlations (or sometimes anti-correlations) between the occurrences of planets around the
same host also suggest that one may not be able to extrapolate a parameterized distribution
of the planetary system frequency to a parameter space that is not covered by the data.
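
The arithmetic of this example can be sketched in Python as follows. The function names are hypothetical. The "nested" case is an illustrative limiting assumption made here (every cold-giant host also hosts a Kepler-like planet); it is not a statement of the measured architecture, and it only shows how a strong positive correlation pulls the joint fraction down toward the larger of the two individual fractions.

```python
def joint_fraction_independent(f_a, f_b):
    """Fraction of stars with at least one planet if the two populations are independent."""
    return 1.0 - (1.0 - f_a) * (1.0 - f_b)


def joint_fraction_nested(f_a, f_b):
    """Limiting case (assumption): hosts of the rarer population are a subset of the other."""
    return max(f_a, f_b)


f_kepler_like = 0.30  # fraction of Sun-like stars with planets Kepler is sensitive to
f_cold_giant = 0.10   # fraction of Sun-like stars with cold giant planets

print(round(joint_fraction_independent(f_kepler_like, f_cold_giant), 2))  # 0.37
print(joint_fraction_nested(f_kepler_like, f_cold_giant))                 # 0.3
```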

A number of studies have reported $F_p$ by using the detectability of the first detected (or
the most detectable) planet in the system as that of the whole system (e.g., Cumming et al.
2008; Mayor et al. 2011; Fressin et al. 2013; Petigura et al. 2013). This approach does not
require assumptions on planet multiplicity or architecture. However, as the detectability
of any planet is no greater than the detectability of the system it resides in, this approach
tends to overestimate $F_p$ (Zhu et al. 2018b).
1.2. On inferring the frequency of planets


In this section, we discuss the commonly used methods of inferring the frequency of planets
$\bar{n}_p$ from a statistical survey of $N_\star$ target stars.

A popular method is the so-called inverse detection efficiency method (IDEM), which has
been used extensively in the literature, including many influential studies (e.g., Mayor et al.;
2015). For our illustrative survey, the average number of planets per star according
to IDEM is

$$\bar{n}_p^{\rm IDEM} = \frac{1}{N_\star} \sum_{i=1}^{N_p} \frac{1}{p_i} = \frac{N_p}{N_\star} \left\langle \frac{1}{p} \right\rangle. \tag{4}$$


Here $p_i$ is the survey detection efficiency of the $i$-th of $N_p$ detected planets and $\langle \cdot \rangle$ is the
average over all detected planets. IDEM is intuitive, simple to perform, and computationally
efficient, as it does not require computing the detection efficiencies of null detections (which
are usually the majority of the targets), so it is useful for getting a rough estimate of the
underlying frequency. However, this method is not rigorously grounded in probability
theory and can potentially lead to biased results (2018). Specifically, with a low detection
efficiency and a small number of detections, IDEM has been found to often underestimate
$\bar{n}_p$, since the actual detections typically come from targets with larger-than-average
sensitivities. IDEM can also suffer
substantial fluctuations because of the inversion of the (typically small) detection efficiency.

An approach with a sound statistical basis is to model planet occurrence as a Poisson
process and perform a maximum likelihood analysis (e.g., 2015). In a given bin that has
$N_p$ planet detections, the maximum likelihood (ML) estimator for planet frequency has
been shown to be

$$\bar{n}_p^{\rm ML} = \frac{N_p}{\sum_{j=1}^{N_\star} p_j} = \frac{N_p}{N_\star \langle p \rangle} \equiv \frac{N_p}{N_\star^{\rm eff}}. \tag{5}$$


Here $p_j$ is the planetary detection efficiency in the bin for the $j$-th star, regardless of
whether the star yields any actual planet detection or not, and $N_\star^{\rm eff}$ is the effective sample
size. Unlike in Equation 4, the average here is performed among all stars in the
sample. Compared to IDEM, this method is computationally more expensive, while being
statistically superior. It is more robust against fluctuations in the efficiencies of individual
detections (as well as null detections) because the averaging is performed on $p$ rather than
$1/p$.
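
The difference between the two estimators can be seen in a minimal Python sketch. The function names and the four-star toy survey are hypothetical, and each detected planet is assumed here to carry the detection efficiency of its host star. The toy survey is deliberately constructed so that both detections come from the high-efficiency targets, which is the situation described above.

```python
def idem_estimate(p_detected, n_stars):
    """Equation 4: sum of inverse efficiencies of the detected planets, divided by N_star."""
    return sum(1.0 / p for p in p_detected) / n_stars


def ml_estimate(n_detected, p_all_stars):
    """Equation 5: number of detections divided by the effective sample size sum_j p_j."""
    n_eff = sum(p_all_stars)
    return n_detected / n_eff


# Toy survey (assumed values): two sensitive targets and two insensitive ones.
p_all = [0.5, 0.5, 0.1, 0.1]
# Both detections happen to come from the two sensitive targets.
p_det = [0.5, 0.5]

print(idem_estimate(p_det, len(p_all)))  # 1.0
print(ml_estimate(len(p_det), p_all))    # about 1.67
```

In this toy case IDEM returns a lower value than the ML estimator because it never uses the efficiencies of the targets without detections.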

Next we elaborate on incorporating the above approach into the Bayesian framework
following the simplified Bayesian model of Hsu et al. (see their Appendix B) but
with some corrections. The posterior probability distribution of planet frequency $\bar{n}_p$ for the
statistical sample is given by

$$P(\bar{n}_p \mid N_p, N_\star^{\rm eff}) \propto P(N_p \mid \bar{n}_p, N_\star^{\rm eff}) \, P_{\rm pri}(\bar{n}_p). \tag{6}$$


The first term on the right-hand side quantifies the probability (or likelihood) of having the
$N_p$ detections for a given rate $\bar{n}_p$, which under the Poisson process assumption is described
by a Gamma distribution.$^1$ The second term, $P_{\rm pri}(\bar{n}_p)$, is the prior distribution of $\bar{n}_p$. If a
conjugate prior is assigned as a Gamma distribution with a shape parameter $\alpha_0$ and a rate
parameter $\beta_0$, the resulting posterior distribution is then a Gamma distribution with the
shape parameter $\alpha_0 + N_p$ and the rate parameter $\beta_0 + N_\star^{\rm eff}$,

$$P(\bar{n}_p \mid N_p, N_\star^{\rm eff}) \propto \bar{n}_p^{\alpha_0 + N_p - 1} \, e^{-\bar{n}_p (\beta_0 + N_\star^{\rm eff})}. \tag{7}$$

For a flat prior on $\bar{n}_p$, the two parameters are $\alpha_0 = 1$ and $\beta_0 = 0$, respectively. For
completeness, the mean and standard deviation of this Gamma distribution posterior are

$$\langle \bar{n}_p \rangle = \frac{\alpha_0 + N_p}{\beta_0 + N_\star^{\rm eff}}, \qquad \sigma(\bar{n}_p) = \frac{\sqrt{\alpha_0 + N_p}}{\beta_0 + N_\star^{\rm eff}}. \tag{8}$$


The first expression reduces to the ML estimator of Equation 5 if a log-flat prior on the
planet frequency $\bar{n}_p$ is assumed (i.e., $\alpha_0 = \beta_0 = 0$). The expressions given by Equation 8
provide easy-to-use estimates to report when the number of detections is relatively large.
However, when small or null detections are involved, the posterior probability distribution
is fairly non-Gaussian. It is then more appropriate to report the median value, the 68%
credible interval, and/or the 95% upper limit, all of which can be derived from the
cumulative posterior probability distribution. It is also worth noting that, in the case of null
detections, a meaningful upper limit on $\bar{n}_p$ cannot be derived with the log-flat prior because
the shape parameter becomes zero and the Gamma distribution is undefined. We show in
Section 2.1 an application of the Bayesian approach to derive the frequency of planets in
the Kepler parameter space.
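
As a minimal sketch of how these summary statistics could be computed, the Python code below evaluates Equation 8 and inverts the cumulative posterior by bisection. The function names are hypothetical. The cumulative distribution uses the closed form that holds only for a positive integer shape parameter, which is the case for the flat prior ($\alpha_0 = 1$) with an integer number of detections; the sketch is intended for the small-count regime where quantiles matter, and the effective sample size of 100 in the usage lines is an assumed toy value.

```python
import math


def posterior_mean_std(n_det, n_eff, alpha0=1, beta0=0.0):
    """Mean and standard deviation of the Gamma posterior (Equation 8)."""
    shape = alpha0 + n_det
    rate = beta0 + n_eff
    return shape / rate, math.sqrt(shape) / rate


def gamma_cdf_integer_shape(x, shape, rate):
    """Gamma CDF for a positive INTEGER shape: 1 - exp(-rx) * sum_{k<shape} (rx)^k / k!."""
    lam = rate * x
    term = 1.0
    total = 1.0
    for k in range(1, shape):
        term *= lam / k
        total += term
    return 1.0 - math.exp(-lam) * total


def posterior_quantile(q, n_det, n_eff, alpha0=1, beta0=0.0):
    """Value of n_p below which a fraction q of the posterior lies (bisection)."""
    shape = alpha0 + n_det
    rate = beta0 + n_eff
    lo, hi = 0.0, shape / rate
    while gamma_cdf_integer_shape(hi, shape, rate) < q:
        hi *= 2.0  # expand the bracket until it contains the quantile
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf_integer_shape(mid, shape, rate) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Null detection with a flat prior: the 95% upper limit equals -ln(0.05) / N_eff.
print(posterior_quantile(0.95, n_det=0, n_eff=100.0))  # about 0.030
# Three detections: mean, standard deviation, and median of the posterior.
print(posterior_mean_std(3, 100.0))
print(posterior_quantile(0.5, n_det=3, n_eff=100.0))
```

A 68% credible interval follows in the same way from the quantiles at 0.16 and 0.84.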


2. THE INNER PLANETARY SYSTEM 


We review in this section planet statistics in the inner region ($\lesssim 1$ AU of Sun-like stars),
which is well explored thanks to thousands of planets detected by the RV and transit
techniques. We focus on the best statistical probe by far of the inner region, namely the large and
uniform sample from the Kepler mission, which is sensitive to transiting planets with radii
$R_p$ down to $\sim R_\oplus$ and orbital periods $P$ up to $\sim 1$ yr (Borucki et al. 2010).


We first derive a clean baseline sample based on the final Kepler data release (DR25;
Thompson et al. 2018) and the improved stellar parameters from (2020b). The
latter work combines the astrometric measurements from Gaia DR2 (Gaia Collaboration
et al. 2018) with the available photometric and spectroscopic information to yield stellar
radii with a median uncertainty of 4%. Starting from the DR25 planet catalog, we have
removed planet candidates with: a) a transit signal-to-noise ratio (S/N) below the nominal
threshold (S/N = 7.1), b) a NASA Exoplanet Archive$^2$ disposition flag of false positive,
c) a derived planetary radius $R_p > 20\,R_\oplus$, d) an orbital period $P > 400$ days, or e) a
best-fit transit impact parameter $b > 1$. We restrict the sample to Sun-like stars, defined as
main-sequence stars (as classified by Berger et al. 2018) with effective temperatures between
4700 K and 6500 K. The bulk of this section is about planets around Sun-like hosts, and
topics such as correlations with various stellar properties, including stellar mass, metallicity,
and binarity, are discussed in Section 2.5.

$^1$ A Gamma distribution can be parameterized in terms of a shape parameter $\alpha$ (> 0) and
a rate parameter $\beta$ (> 0). The probability density function of a variable $x$ is
$f(x; \alpha, \beta) = \beta^{\alpha} x^{\alpha - 1} e^{-\beta x} / \Gamma(\alpha) \propto x^{\alpha - 1} e^{-\beta x}$, where $\Gamma(\alpha)$ is the Gamma function evaluated at $\alpha$.

$^2$ exoplanetarchive.ipac.caltech.edu

The baseline sample contains 2,525 planet detections around 98,213 Sun-like stars. Of
all the transiting planets, 1,451 are found in systems with only one detected transiting
planet,$^3$ and the remaining 1,074 are from systems with multiple detected transiting planets. The
average observed multiplicity rate, namely the average fraction of planets from known
multi-planet systems, is 42.5%. This is a lower limit on the intrinsic multiplicity rate, as many
of the single-planet systems seen in Kepler are likely part of intrinsic multi-planet systems
(see details in Section 2.2). The observed transit multiplicity distribution in the sample is

$$(N_1, N_2, N_3, N_4, N_5, N_6, N_7) = (1451, 278, 97, 37, 12, 2, 1), \tag{9}$$

and no system has more than seven transiting planets.$^4$ Figure 2 illustrates the planets
in our sample in the radius-period plane. Different multiplicities of transiting planets are
shown with different symbols.
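
The quoted counts follow directly from the multiplicity distribution of Equation 9, as the short Python check below shows. The variable names are hypothetical; the numbers are those given in the text.

```python
# Number of systems with k detected transiting planets (Equation 9).
multiplicity_counts = {1: 1451, 2: 278, 3: 97, 4: 37, 5: 12, 6: 2, 7: 1}

n_systems = sum(multiplicity_counts.values())
n_planets = sum(k * n for k, n in multiplicity_counts.items())
n_in_multis = sum(k * n for k, n in multiplicity_counts.items() if k >= 2)

# Fraction of detected planets that belong to systems with more than one detected planet.
observed_multiplicity_rate = n_in_multis / n_planets

print(n_planets, n_in_multis)                     # 2525 1074
print(round(100 * observed_multiplicity_rate, 1)) # 42.5
print(n_systems)                                  # 1878
```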


2.1. Planet distribution in the radius-period plane 


With the above statistical sample, we derive the planet frequencies in the Bayesian
framework of Section 1.2. The parameter space in the radius-period plane is divided into
logarithmically equally spaced cells.$^5$ In each cell, the number of planet detections, $N_p$, is
found and the average detection efficiency, $\langle p \rangle$, is computed via

$$\langle p \rangle = \frac{\int_{P_{\min}}^{P_{\max}} \int_{R_{p,\min}}^{R_{p,\max}} (R_\odot / a) \, S(P, R_p) \, d\ln P \, d\ln R_p}{\int_{P_{\min}}^{P_{\max}} \int_{R_{p,\min}}^{R_{p,\max}} d\ln P \, d\ln R_p}. \tag{10}$$


Here $R_{p,\min}$, $R_{p,\max}$, $P_{\min}$, and $P_{\max}$ denote the boundaries of the cell, and $R_\odot / a$ is
approximately the transit geometric probability at semi-major axis $a$ around a Sun-like host.
The sensitivity due to survey detection thresholds at a given period and radius, $S(P, R_p)$,
is computed with the KeplerPORTs code,$^6$ which was first developed in Burke et al. (2015)
and further updated for Kepler DR25 (Burke & Catanzarite 2017a) by incorporating results
of transit injection and recovery tests for the final Kepler pipeline
(2020). Updated stellar parameters were used to derive the mean
sensitivity curve.
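
A minimal numerical sketch of Equation 10 is given below, using a midpoint rule on a uniform grid in $\ln P$ and $\ln R_p$. The function names are hypothetical, the host is assumed to have exactly one solar mass and one solar radius, and `toy_sensitivity` is an invented smooth completeness function used purely as a placeholder; it is not the KeplerPORTs output.

```python
import math

R_SUN_AU = 0.00465047  # solar radius in astronomical units


def semi_major_axis_au(period_days, m_star=1.0):
    """Kepler's third law for a planet of negligible mass (m_star in solar masses)."""
    return (m_star * (period_days / 365.25) ** 2) ** (1.0 / 3.0)


def mean_efficiency(p_min, p_max, r_min, r_max, sensitivity, n_grid=200):
    """Midpoint-rule estimate of Equation 10 over one cell of the radius-period plane."""
    du = math.log(p_max / p_min) / n_grid
    dv = math.log(r_max / r_min) / n_grid
    total = 0.0
    for i in range(n_grid):
        period = p_min * math.exp((i + 0.5) * du)
        geometric = R_SUN_AU / semi_major_axis_au(period)  # transit probability R_sun / a
        for j in range(n_grid):
            radius = r_min * math.exp((j + 0.5) * dv)
            total += geometric * sensitivity(period, radius)
    # With uniform spacing in ln P and ln R_p the cell area cancels out,
    # so the ratio of integrals reduces to the grid average.
    return total / (n_grid * n_grid)


def toy_sensitivity(period_days, radius_earth):
    """Hypothetical completeness rising with radius and falling with period (placeholder)."""
    x = radius_earth ** 2 / period_days ** (1.0 / 3.0)
    return x / (1.0 + x)


# Average detection efficiency in a cell spanning 10-20 days and 2-4 Earth radii.
print(mean_efficiency(10.0, 20.0, 2.0, 4.0, toy_sensitivity))
```

Replacing `toy_sensitivity` with a tabulated pipeline sensitivity would leave the averaging step unchanged.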

We adopt a flat prior on $\bar{n}_p$, and its posterior distribution is then described by the
Gamma distribution of Equation 7 with $\alpha_0 = 1$ and $\beta_0 = 0$. For cells with $\le 2$ detections
we report the 95% upper limits, whereas for the rest the means and the standard
deviations given by Equation 8 are reported as the measurements and associated uncertainties,
respectively. We have verified that the deviation between the mean and the median is
substantially smaller than the uncertainty for all relevant cells.


$^3$ We sometimes use the contraction “tranet” to stand for “transiting planet” in the text and
figure legends and captions.

$^4$ Note that the only system in our sample with seven transiting planets, Kepler-90, has been
found to contain one additional planet candidate (Shallue & Vanderburg 2018). However, this
additional candidate was not found by the Kepler DR25 pipeline and is thus not included.

$^5$ Since the typical uncertainties of planetary period and radius are much smaller than the cell sizes,
we ignore the uncertainties of planetary parameters. See Foreman-Mackey et al. (2014) for how to
incorporate the planet parameter errors in the analysis.

$^6$ The code is publicly available at https://github.com/nasa/KeplerPORTs
Figure 2

The close-in (P < 400 days) Kepler planets in the radius-period plane, with various symbols and
colors indicating the observed multiplicity (note that we use the contraction “tranet” to refer to
“transiting planet” in the legend). The gray solid curve indicates the median detection efficiency
of the planet search pipeline. The solar system planets in the inner region, namely Mercury,
Venus, and Earth, are denoted with their first letters. The median precision on the planetary
radius is ~ 7%. The radii of Jupiter and Neptune are shown with horizontal dashed lines.
The radius valley at $\sim 2\,R_\oplus$ (see Section 2.1.4) is visible.


The derived planet frequency map is shown in Figure 3. For cells with more than two
detections, we also indicate the observed multiplicity rate of planets in the cell. Again,
these multiplicity rates represent the lower limits on the fraction of planets in those cells
that reside in multi-planet systems. We summarize several key results below:


• The integrated planet frequency is $\bar{n}_p = 1.23 \pm 0.06$ for planets with radii in the range
1-20 $R_\oplus$ and orbital periods up to 400 days. This is broadly consistent with results
from previous studies (e.g., Fressin et al. 2013; Petigura et al.; Hsu et al. 2019).
As stressed in Section 1.1, statistical analyses like this one do not yield the fraction
of stars with planets $F_p$, as the impact of the multiplicity and the mutual inclination
has not been taken into account (see Section 2.2).

• As has been clear since the earliest Kepler statistical studies, there are generally
many more small planets with radii $R_p \lesssim 4\,R_\oplus$ than larger ones, for orbital periods
$P < 400$ days. Planet frequencies tend to increase from the upper left (large $R_p$
and small $P$) toward the lower right (small $R_p$ and large $P$). In other words, the
intrinsic radius distribution is dependent on the orbital period (e.g., Dong & Zhu
2013; Foreman-Mackey et al. 2014; 2018). There exist some local regions
where the general trends break down, such as the radius valley (see Section 2.1.4).

• Sub-Earths ($R_p < 1\,R_\oplus$) and Earth-sized planets in Earth-like orbits are not well
probed by Kepler, and thus estimates of their frequencies are most susceptible to
the uncertainties of survey sensitivity estimates. As a result, there remain large
discrepancies in their intrinsic frequencies in the literature (see, e.g., Figure 17 of
Burke et al. 2015).

• Kepler planets commonly reside in multi-planet systems in most parts of the
radius-period plane, with some notable exceptions such as the hot Jupiter region
(et al. 2012; see Section 2.1.1 for more discussion). The intrinsic multiplicity rates are
likely higher than the observed multiplicity rates shown in Figure 3. We defer to
Section 2.2 for further discussion.

Figure 3

This figure illustrates the planet frequencies ($\bar{n}_p$) and the observed multiplicity fractions based on the planet sample in
Figure 2. The numbers and error bars are the average number of planets with periods and radii within the given cell per
100 Sun-like stars. If there are fewer than three detections found within the cell, then the 95% upper limit is reported
instead. These upper limits are highlighted in red. The fraction in each cell denotes the observed multiplicity fraction,
namely the fraction of planets in that cell found to reside in multi-planet systems. We use “N/A” for cells with fewer than
three detections. The red dashed lines mark the regions corresponding to hot Jupiters, the hot Neptune “desert,” and USPs.


The above method to derive the planet frequency $\bar{n}_p$ is non-parametric. An alternative
approach employs a parameterized planet distribution function and then constrains the
associated parameters. The parametric approach has been widely used in statistical studies
of various detection techniques, including transit (e.g., Burke et al. 2015), RV (e.g.,
Tabachnik & Tremaine 2002; Cumming et al. 2008), and microlensing studies (e.g.,
2016). It is also commonly used in simulations that generate synthetic planetary systems
(e.g., Mulders et al. 2018; 2019). The commonly adopted planet distribution
function is separable between the orbital period (or semi-major axis) and the planetary
radius (or mass),

$$\frac{d^2 N}{d\ln P \, d\ln R_p} \propto \frac{dN}{d\ln P} \, \frac{dN}{d\ln R_p}. \tag{11}$$

The distributions of the orbital period and the planetary radius are usually parameterized 
as power laws or broken power laws. The use of such a separable function implicitly assumes 
that the period (radius) distribution is independent of the planetary radius (period). As 
discussed above, such an assumption is not valid for the inner planetary system. It is 
likely not valid for planet distributions in other regions of the parameter space, either. The 
implications of this failure on the derived occurrence rates from the parametric method and 
on the theoretical interpretations of the underlying population have not been fully explored. 
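
One simple way to see what separability implies for a binned frequency map is sketched below in Python. If the map were separable, every entry would be the product of a period factor and a radius factor, so the ratio between any two radius bins would be the same at every period. The function name and the two toy grids are hypothetical, all entries are assumed strictly positive, and measurement uncertainties are ignored; a real test would have to account for them.

```python
def is_separable(grid, rtol=1e-6):
    """True if grid[i][j] = f[i] * g[j], assuming strictly positive entries."""
    first_row = grid[0]
    for row in grid[1:]:
        for j in range(1, len(row)):
            # Cross-multiplied form of row[j] / row[0] == first_row[j] / first_row[0].
            lhs = row[j] * first_row[0]
            rhs = row[0] * first_row[j]
            if abs(lhs - rhs) > rtol * max(abs(lhs), abs(rhs)):
                return False
    return True


# Rows are period bins, columns are radius bins (toy frequencies, arbitrary units).
separable_map = [
    [3.0, 1.0, 0.5],
    [6.0, 2.0, 1.0],
    [12.0, 4.0, 2.0],
]
# Same map with one cell depleted, mimicking a period-dependent radius distribution.
depleted_map = [
    [3.0, 1.0, 0.05],
    [6.0, 2.0, 1.0],
    [12.0, 4.0, 2.0],
]

print(is_separable(separable_map))  # True
print(is_separable(depleted_map))   # False
```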

In what follows, we provide brief discussions about selected regions in the radius-period 
plane. 


2.1.1. Hot Jupiters. As the first type of exoplanets found around solar-type stars (Mayor &
Queloz 1995), hot Jupiters ($8\,R_\oplus < R_p < 20\,R_\oplus$ and $P < 10$ days) remain interesting and
exciting targets for both observational and theoretical purposes. Here we only review the
occurrence and multiplicity rates of hot Jupiters in the current context and refer interested
readers to the recent review by Dawson & Johnson (2018) for more in-depth discussions
about the hot Jupiter population.

There is a long-standing discrepancy between the hot Jupiter frequency inferred from
RV and transit surveys (e.g., Gould et al. 2006a; Wright et al. 2012; see Table B9 of
(2016) for an incomplete list). For example, our statistical sample yields a
rate of $0.62 \pm 0.09\%$, which is in good agreement with previous studies of the hot Jupiter
frequency in the Kepler field (e.g., 2012; Fressin et al. 2013;
2016), whereas the RV surveys of stars in the Solar Neighborhood report rates that are
typically a factor of ~2 higher (0.9-1.2%; 2011; Wright et al. 2012). It was
suggested that the discrepancy could be caused by the different stellar properties, such as
age, metallicity, and binary fraction, between the RV and transit samples. This has been
tested by several follow-up studies of the Kepler sample. The Kepler stars are only slightly
sub-solar on average ($\langle {\rm [Fe/H]} \rangle_{\rm Kepler} \approx -0.04$; Dong et al. 2014b), and their metallicity
differences with the RV targets ($\langle {\rm [Fe/H]} \rangle_{\rm RV} \approx 0.0$) seem to be too small to fully account for
the discrepancy even given the steep dependence of hot Jupiter frequency on metallicity
(Guo et al. 2017). The unresolved binaries are also unlikely to substantially change the hot
Jupiter frequency in the Kepler sample (Bouma et al. 2018). However, because RV surveys
preferentially exclude close (~ 1-50 AU) stellar binaries from their sample, this discrepancy
in hot Jupiter frequencies between transit and RV surveys can potentially be resolved if
the formation of hot Jupiters is suppressed in such close binary systems (Moe & Kratter
2019). Searching for stellar companions of transiting hot Jupiters
and making comparisons with field stars is a promising way to further test this possibility.

As shown in Figure 2, one out of the 49 hot Jupiters in our statistical sample,
Kepler-730b, has a nearby small planet companion (Zhu et al. 2018a; 2019). As of
writing, only two other hot Jupiters, WASP-47b (Becker et al. 2015) and TOI-1130c
(2020), are known to share the same property. Our statistical sample suggests that
~ 2% (< 9.7%, 95% upper limit) of hot Jupiters have nearby (< 20 days), small ($\sim 1$-$4\,R_\oplus$),
and nearly coplanar companions (constraints on non-coplanar companions have also been
reported). This low multiplicity rate of hot Jupiters supports the general idea
that most of them have undergone some large-scale migration to arrive at their current locations
(e.g., Rasio & Ford 1996a; Weidenschilling & Marzari 1996). We refer to
Dawson & Johnson (2018) for more in-depth discussions on this topic.


2.1.2. Hot Neptune “desert”. The region $P < 4$ days and $2\,R_\oplus \lesssim R_p \lesssim 8\,R_\oplus$ lands in
the so-called hot Neptune (or sub-Jovian) “desert” (e.g., Szabo & Kiss 2011; Beaugé &
Nesvorny 2013; Mazeh et al. 2016; 2016, and references therein), which is considered
underpopulated, especially when inspecting mixed planet samples found in surveys with different
detection sensitivities (e.g., ground-based transits and Kepler). This “desert” is, however,
not that barren: the total planet frequency enclosed in the above region is $0.61 \pm 0.07\%$
from our statistical analysis (see Figure 3), making this hot Neptune “desert” about as
populated as the hot Jupiter region (see also Dong et al. 2018). While the above frequency
is derived for a rectangular region in the radius-period plane, it is worth noting that the
boundaries of this “desert” region are better described as a triangle and extend out to
5-10 d in the $m_p$ vs. $a$ and $R_p$ vs. $P$ planes (see Figure 1 and Figure 4 of Mazeh et al. 2016).
Dong et al. (2018) found that the frequency of planets inside this region depends on the
host star metallicity in a way similar to the frequency of hot Jupiters, and they dubbed this
population “Hoptunes” (rather than “hot Neptunes”) to reflect that not all of them were
known to be physically Neptune-like. Out of our baseline sample of 61 planets in this region, 14
are observed to have planetary companions, and the periods of the majority of these
companions are within 10 d. The observed multiplicity rate is thus 23%, which is lower than the
Kepler average but higher than that of hot Jupiters (see also Dong et al. 2018). We refer
to Dawson & Johnson (2018) for more discussions on the connection of this population with
close-in Jupiters and related theoretical implications.

A number of theories have been proposed to explain the formation of planets in this
region (e.g., Kurokawa & Nakamoto 2014; Matsakos & Königl 2016; Lundkvist et al. 2016;
Bailey & Batygin). The leading explanations of its triangular
boundaries invoke photoevaporation (see more discussion in Section 2.1.4) and tidal effects
following the high-eccentricity migration. The upper boundary is best explained as the tidal
disruption barrier for gas giants following their high-eccentricity migrations (Matsakos &
Königl 2016; 2018). More massive planets can be tidally circularized closer to
the star without tidal disruption, resulting in the negative slope of the upper boundary. It
has been proposed that the same mechanism also produces the lower boundary, with the
positive slope resulting from a mass-radius relation of small planets that is different from
the relation of giant planets (Matsakos & Königl 2016). However, this mechanism may not
be able to explain the planets that are in or near the “desert” region and reside in
multi-planet systems. An alternative theory, proposed by Owen & Lai (2018), suggests that the
lower boundary is better explained by the photoevaporation of highly irradiated planets,
and that the positive slope results from the fact that the photoevaporation mechanism is
more effective if the planet is closer to the host star. There has been a growing interest in
planets in this region with the TESS mission (e.g., Armstrong et al. 2020; Burt et al. 2020),
and the follow-up studies of such planets will soon allow for a better understanding of their
physical properties and formation mechanisms.


2.1.3. Ultra-short-period planets. Planets with radii between 0.5-2 $R_\oplus$ and periods $P < 1$ d,
known as ultra-short-period planets (USPs), represent a rather extreme planet population.
The period threshold for USPs at one day corresponds to an equilibrium temperature of
~2000 K for a Sun-like host, which is hot enough to sublimate dust grains. Below we
briefly summarize several key properties of USPs and refer interested readers to the recent
comprehensive review by Winn et al. (2018) for more discussions about this extreme planet
population.
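
As a rough check of the ~2000 K figure quoted above, the sketch below evaluates the equilibrium temperature $T_{\rm eq} = T_{\rm eff}\sqrt{R_\star/(2a)}\,(1-A)^{1/4}$ for a planet on a one-day circular orbit around a Sun-like star. The formula (full heat redistribution, albedo $A = 0$ by default), the solar constants, and the function names are assumptions of this illustration and are not taken from the analysis in this review.

```python
import math

# Physical constants in SI units (nominal values, assumed for this sketch).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # solar mass, kg
R_SUN = 6.957e8      # solar radius, m
T_EFF_SUN = 5772.0   # solar effective temperature, K


def semi_major_axis(period_days, m_star=M_SUN):
    """Semi-major axis (m) from Kepler's third law, neglecting the planet mass."""
    period_s = period_days * 86400.0
    return (G * m_star * period_s ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)


def equilibrium_temperature(period_days, t_eff=T_EFF_SUN, r_star=R_SUN,
                            m_star=M_SUN, albedo=0.0):
    """Equilibrium temperature (K), assuming full heat redistribution."""
    a = semi_major_axis(period_days, m_star)
    return t_eff * math.sqrt(r_star / (2.0 * a)) * (1.0 - albedo) ** 0.25


t_usp = equilibrium_temperature(1.0)
print(f"T_eq at P = 1 d around a Sun-like star: {t_usp:.0f} K")
```

Under these assumptions the result lands close to 2000 K; a nonzero albedo or a cooler host lowers it.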


Our statistical analysis yields nj = A 2 04)% for USPs. This is in general agreement 


with the result of Sanchis-Ojeda et al. (2014), whose specialized pipeline yields
$\bar{n}_p = (0.51 \pm 0.07)\%$ for planets with radii in the range 0.84-2 $R_\oplus$ and $P < 1$ d. Out of the 81 USPs in our
sample, 16 are found with outer planetary companions, indicating an observed multiplicity
rate of 20%. The true multiplicity rate is probably much higher, since USPs can be largely
misaligned relative to the outer planetary companions (Petrovich et al. 2019).
In 13 of the 16 multi-planet systems involving USPs, the closest outer companion
has $P \lesssim 10$ d, and the USP is usually farther apart in terms of the period ratio from the
rest of the planets in the same system (see also Steffen & Farr 2013).


The highly irradiated environment at sub-day orbits implies that USPs are unlikely to
have formed in situ. Partially because of the comparable rates between hot Jupiters and
USPs, it had been suggested that USPs could be the surviving cores of tidally disrupted
hot Jupiters (Jackson et al. 2013), but this was not supported by several pieces of evidence,
including the lack of strong host metallicity dependence and the relatively
high multiplicity rate compared to hot Jupiters. A more plausible scenario is that the USPs
have arrived at their current locations without losing much of their initial mass. One way
of achieving this is the gradual decay of the orbit due to the tidal dissipation within the
host star (Lee & Chiang 2017). Alternatively, the proto-USP planet may have been sent to
an eccentric (and misaligned) orbit following the dynamical interactions with other planets
in the system, and then the orbit decays and circularizes due to the tidal dissipation within
the planet. The latter
model sees its support in the relatively large mutual inclinations of USPs (Dai et al. 2018).
Additionally, in order for the tidal inspiral model to produce USPs, the tidal dissipation in
USP hosts needs to be efficient, but the population analysis on stellar kinematic ages seems
to suggest otherwise (Hamer & Schlaufman 2020).


2.1.4. Radius valley. An important discovery in the field of exoplanets in recent years is the
radius valley, which refers to a region in the radius-period plane at radii $R_p \sim 2\,R_\oplus$ and
periods between ~3-30 days (Fulton et al. 2017; Van Eylen et al. 2018; Fulton & Petigura
2018). This radius valley is visible in our statistical sample (see Figure 2). The position of
the valley in radius is reported to decrease with the orbital period (Van Eylen et al. 2018)


www.annualreviews.org « Exoplanet Statistics and Theoretical Implications 


18 


[Figure 4 image. Panel (a): rescaled radius $\tilde{R}_p$ ($R_\oplus$) vs. orbital period (days). Panel (b): number of planets per 100 stars (for $3 < P/{\rm day} < 30$) vs. rescaled radius $\tilde{R}_p$ ($R_\oplus$), with curves for $(g, h) = (0, 0)$, $(-0.09, 0)$, and $(-0.09, 0.26)$.]


Figure 4


(a) The zoom-in view of the Kepler planets in our sample centered at the radius valley. The y-axis shows the rescaled
radius $\tilde{R}_p = R_p (P/10\,{\rm days})^{-g} (M_\star/M_\odot)^{-h}$ (see Equation 12), and we adopt the best-fit $g = -0.09$ (Van Eylen et al.
2018) and $h = 0.26$ (Berger et al. 2020a). The black box marks the boundary within which planets are used to derive the
intrinsic radius distribution. (b) The intrinsic distribution of the rescaled radius $\tilde{R}_p$. The radius gap, highlighted in the
gray band, is most prominent when both period and stellar mass dependences are taken into account.
and increase with the stellar mass (Wu 2019; Berger et al. 2020a). The two dependences
can be parameterized as

$$R_p^{\rm valley} = R_0 \left(\frac{P}{10\,{\rm days}}\right)^{g} \left(\frac{M_\star}{M_\odot}\right)^{h}. \tag{12}$$

The valley position at orbital period $P = 10$ days and host mass $M_\star = M_\odot$ is found to be
$R_0 = 1.9 \pm 0.2\,R_\oplus$, and the slope quantifying the period dependence is $g = -0.09^{+0.02}_{-0.04}$
(Van Eylen et al. 2018). The slope quantifying the stellar mass dependence is $h = 0.26^{+0.21}_{-0.16}$
(Berger et al. 2020a). With the above relation one can then highlight the radius valley
by rescaling the radius to $\tilde{R}_p = R_p (P/10\,{\rm days})^{-g} (M_\star/M_\odot)^{-h}$. Figure 4a illustrates our
sample in this rescaled radius (with $g = -0.09$ and $h = 0.26$) vs. orbital period plane. We
also show the intrinsic distribution of the rescaled radius in Figure 4b for planets with
$\tilde{R}_p$ in the range 1-4 $R_\oplus$ and $P$ in the range 3-30 days. Our choice of the period upper
boundary is motivated by Figure 2: beyond ~30 days the number of detections in the
relevant region, and thus the statistical power, drops significantly. The peak-to-dip contrast
in our “radius” distribution is not as significant as that shown in Fulton et al. (2017) and
Fulton & Petigura (2018). In particular, our rescaled radius distribution does not show
an obvious single peak at $\tilde{R}_p < R_p^{\rm valley}$. We have tried the same period range as
used in those studies and confirm that our specific choice of the period range is not the
cause of this difference. One possible reason is the different statistical methods used to
infer the occurrence rate: as discussed in Section 1.2, the IDEM approach used in Fulton
et al. (2017) and Fulton & Petigura (2018) tends to underestimate the occurrence rates in
low-sensitivity regions ($R_p \sim R_\oplus$). The fact that the radius distribution does not seem to
decrease at sub-Earth sizes suggests the presence of many undiscovered sub-Earths. The
broader radius distribution may also imply that the planetary mass distribution is not as
narrowly peaked as some previous studies inferred (e.g., Wu 2019).
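
The rescaling described in this paragraph can be sketched as follows. The function names are hypothetical, and the defaults are simply the central values quoted in the text ($R_0 = 1.9\,R_\oplus$, $g = -0.09$, $h = 0.26$) with their uncertainties ignored; this is an illustration of the parameterization, not the pipeline used to produce Figure 4.

```python
def valley_radius(period_days, m_star_solar, r0=1.9, g=-0.09, h=0.26):
    """Valley location (Earth radii) as a power law in period and stellar mass."""
    return r0 * (period_days / 10.0) ** g * m_star_solar ** h


def rescaled_radius(r_p, period_days, m_star_solar, g=-0.09, h=0.26):
    """Planet radius with the period and stellar-mass trends of the valley divided out."""
    return r_p * (period_days / 10.0) ** (-g) * m_star_solar ** (-h)


# The valley sits at larger radii for shorter periods and more massive hosts.
for period, mass in [(3.0, 1.0), (10.0, 1.0), (30.0, 1.0), (10.0, 0.7), (10.0, 1.3)]:
    r_valley = valley_radius(period, mass)
    # Rescaling a planet that sits exactly on the valley returns r0 by construction.
    r_tilde = rescaled_radius(r_valley, period, mass)
    print(f"P = {period:4.1f} d, M = {mass:.1f} Msun: "
          f"valley at {r_valley:.2f} Re, rescaled to {r_tilde:.2f} Re")
```

Because every planet on the valley maps to the same rescaled radius, stacking planets with different periods and host masses in $\tilde{R}_p$ should sharpen the gap rather than smear it.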

The leading theory for the radius valley is the atmospheric evaporation driven by
high-energy photons from the host star (photoevaporation;
Owen & Wu 2017). In fact, the existence of the radius valley at approximately the
discovered position had been predicted years before its discovery
(see a historical overview in Owen 2019), which is exceptional in exoplanetary
science. The photoevaporation of the atmosphere is thought to mostly take place during
the early ages of the system, when the star emits a higher fraction of its total luminosity
at high energy ($\lesssim 100$ Myr; e.g., Jackson et al. 2012; Tu et al. 2015, but also see King
& Wheatley 2021). For close-in (~3-30 days) planets with core masses of a few $M_\oplus$,
the high-energy radiation is sufficient to unbind the entire hydrogen/helium atmosphere if
its initial mass fraction is below some critical value (a few percent; Owen & Wu 2017).
The radius valley thus emerges, separating planets with and without extended atmospheres
(Owen & Wu 2017). The observed period and
stellar mass dependences can also be well explained by photoevaporation. As the orbital
period increases and/or the host mass decreases, the amount of high-energy radiation the
planet receives decreases and thus the valley moves to smaller radii (Owen & Wu 2017;
Wu 2019).⁷ We refer interested readers to Owen (2019) for a comprehensive review of the
photoevaporation mechanism.
According to the photoevaporation theory, the properties (e.g., location and shape) of
the radius valley depend on the underlying planetary properties, especially the distributions of
the core mass, core composition, and atmospheric mass fraction (e.g., Lopez
& Fortney 2013). Therefore, the observed radius valley opens up an avenue to statistically
infer the properties of close-in low-mass planets at birth
(e.g., Wu 2019; Rogers & Owen 2020). Assuming that photoevaporation is the underlying
mechanism, these studies collectively point to a typical core mass of a few $M_\oplus$, a core
composition similar to that of the Earth (i.e., rich in silicate/iron and poor in water/ice),
and a typical atmosphere mass fraction at birth of a few percent. These inferred properties
have important implications for the formation and migration history of these close-in planets
(see Section 4).

While photoevaporation has seen success in predicting and explaining the radius
valley, alternative theories exist that can also explain the observed valley (e.g.,
Lee & Connors 2020), of which the core-powered mass-loss mechanism is
considered the main competing theory. Unlike photoevaporation, the energy source for
atmosphere stripping in the core-powered mass-loss mechanism is the internal luminosity of the
cooling core, and this process is expected to operate on much longer timescales (~Gyr)
(Ginzburg et al. 2018). The observed period and stellar mass dependences of the radius
valley (Equation 12) are also consistent with this mechanism (Gupta & Schlichting 2019)


7 Although later-type stars have higher fractions of the total luminosity emitted in higher energy 
(x My 3. and remain active for a longer period of time, these lower-mass stars 
have much lower total luminosities (cc M2 for Solar and later-type stars). The lifetime-integrated 
high-energy radiation at a certain orbital separation is shown to decrease with decreasing stellar 


mass (see Figure 4 of McDonald et al. 2019).
. Similar to photoevaporation, the core-powered mass-loss mechanism also supports the conclusion that
low-mass planets have predominantly rocky cores with low water-ice fractions
(Gupta & Schlichting 2019).

Attempts have been made to identify which of the two mechanisms discussed above is
more responsible for the observed features. These studies made use of the different
stellar mass or age dependences of the two mechanisms (e.g., Hirano et al. 2018; Berger et al.
2020a). However, the currently available data provide no conclusive result to distinguish
between the two. Larger samples and/or more precise measurements of stellar properties
will be needed.


2.2. Mutual inclinations and the intrinsic multiplicity 


The mutual inclination distribution of planets in multi-planet systems conveys important
information on the formation and dynamical evolution of planetary systems. However,
currently employed detection techniques are usually incapable of directly measuring mutual
inclinations. This is particularly true for RV and microlensing. The transit technique is
strongly biased toward (nearly) coplanar systems. Nevertheless, advancements have made
it possible to statistically infer the mutual inclination distribution from the Kepler data.

The key issue in constraining the mutual inclination distribution with transits is the
strong degeneracy with the intrinsic multiplicity (e.g.,
Tremaine & Dong 2012). Specifically, with the observable multiplicity function of transits alone, one
cannot distinguish between high-multiplicity systems with large mutual inclinations and
low-multiplicity systems with small mutual inclinations. We therefore combine mutual
inclinations and intrinsic multiplicity in the same discussion.

Before discussing the statistically inferred mutual inclinations, we briefly overview a
handful of systems with measured large mutual inclinations. By combining HST astrometry
and ground-based RV measurements, McArthur et al. (2010) measured the mutual
inclination between two of the three planetary companions in the Upsilon Andromedae
system to be about 30°. Mills & Fabrycky (2017) performed photo-dynamical modeling of
the transit timing variation (TTV) and transit duration variation (TDV) signals of the
Kepler-108 system and found the mutual inclination to be $\Delta I \approx 24^\circ$ between the two
transiting planets. The pi Mensae system, which hosts a long-period giant planet and a
TESS transiting super-Earth (Huang et al. 2018; Gandolfi et al. 2018), is reported to have
significant mutual inclinations (~30-150°) from joint analyses of the Hipparcos and Gaia
DR2 astrometry (Xuan & Wyatt 2020; Damasso et al. 2020; De Rosa et al. 2020).
Additionally, some USP systems have also been determined to have large mutual inclinations
(e.g., Dai et al. 2018). More planetary systems with large mutual inclinations are expected
to be found in the following years, especially with Gaia’s capability to determine the 3D
orbital configurations (Perryman et al. 2014).


2.2.1. The weighted transit duration method. A popular method to statistically infer the
mutual inclination of Kepler multi-planet systems makes use of the ratio of transit chord
lengths (Steffen et al. 2010)

$$\xi \equiv \frac{T_{\rm in}/P_{\rm in}^{1/3}}{T_{\rm out}/P_{\rm out}^{1/3}} = \sqrt{\frac{(1+r_{\rm in})^2 - b_{\rm in}^2}{(1+r_{\rm out})^2 - b_{\rm out}^2}}. \tag{13}$$

The subscripts “in” and “out” denote values of the inner and the outer transiting planets,
respectively. Here $T$ measures the time from the first to the last contact points of transit, $r$ is
the planet-to-star radius ratio, and $b$ is the transit impact parameter. As both $T$ and period
$P$ are precisely measured from transit data (Seager & Mallén-Ornelas 2003), the parameter
$\xi$ is well determined from observations. The last expression in Equation 13 is used to
construct the $\xi$ distribution from models with assumed mutual inclination distributions. When
two transiting planets are exactly coplanar, the ratio $b_{\rm in}/b_{\rm out} = a_{\rm in}/a_{\rm out} = (P_{\rm in}/P_{\rm out})^{2/3}$
is precisely measured and thus the parameter $\xi$ only concerns one poorly constrained
fiducial parameter (either $b_{\rm in}$ or $b_{\rm out}$, since both $r_{\rm in}$ and $r_{\rm out}$ are reasonably well measured).
The distribution of $\xi$ for coplanar systems is thus expected to narrowly peak at unity. In
practice, the observed distribution is not so narrow, because of the introduction of the
mutual inclination (see Figure 5a).⁸ Applying this weighted transit duration method to
large samples of Kepler planet pairs, several studies (e.g., Fabrycky et al. 2014)
found that the mutual inclinations between transiting planets in the Kepler multi-planet
systems could be well described by a Rayleigh distribution with a dispersion of a few
degrees (< 3-5°). This has been frequently interpreted as multi-planet systems being nearly
coplanar. However, with the use of only transiting planet pairs, which preferentially have
small mutual inclinations, the weighted transit duration method cannot well determine the
higher end of the mutual inclination distribution. As an extreme case, even the isotropic
distribution of orbital inclinations cannot be reliably ruled out with the use of transit data
alone (Tremaine & Dong 2012).
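
To make the two sides of Equation 13 concrete, the sketch below computes $\xi$ once from "observed" durations and periods and once from the radius ratios and impact parameters, for a hypothetical coplanar pair on circular orbits. The helper `circular_duration` assumes $T \propto P^{1/3}\sqrt{(1+r)^2 - b^2}$ (the circular-orbit scaling, up to a constant set by the stellar density); all function names and numerical values are illustrative assumptions.

```python
import math


def circular_duration(period, r, b, scale=1.0):
    """Transit duration for a circular orbit, up to a constant set by the stellar density."""
    return scale * period ** (1.0 / 3.0) * math.sqrt((1.0 + r) ** 2 - b ** 2)


def xi_observed(t_in, t_out, p_in, p_out):
    """Left-hand side of Equation 13: ratio of period-normalized durations."""
    return (t_in / p_in ** (1.0 / 3.0)) / (t_out / p_out ** (1.0 / 3.0))


def xi_model(r_in, r_out, b_in, b_out):
    """Right-hand side of Equation 13: ratio of transit chord lengths."""
    return math.sqrt(((1.0 + r_in) ** 2 - b_in ** 2) /
                     ((1.0 + r_out) ** 2 - b_out ** 2))


def coplanar_b_out(b_in, p_in, p_out):
    """Outer impact parameter of an exactly coplanar pair: b scales with a, i.e. P^(2/3)."""
    return b_in * (p_out / p_in) ** (2.0 / 3.0)


# Hypothetical pair: periods in days, radius ratios, inner impact parameter.
p_in, p_out = 5.0, 12.0
r_in, r_out = 0.02, 0.03
b_in = 0.2
b_out = coplanar_b_out(b_in, p_in, p_out)

t_in = circular_duration(p_in, r_in, b_in)
t_out = circular_duration(p_out, r_out, b_out)

xi_obs = xi_observed(t_in, t_out, p_in, p_out)
xi_mod = xi_model(r_in, r_out, b_in, b_out)
print(f"b_out = {b_out:.3f}, xi (durations) = {xi_obs:.4f}, xi (chords) = {xi_mod:.4f}")
```

In a population model one would draw the mutual inclination (and hence $b_{\rm out}$) from an assumed distribution and compare the resulting $\xi$ distribution with the observed one.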


2.2.2. “Kepler dichotomy”. To recover the true mutual inclination distribution, one needs
to break its strong degeneracy with intrinsic multiplicity. The first attempt was carried
out by Lissauer et al. (2011). The authors tried different functional forms for the intrinsic
multiplicity distribution (uniform, Poisson, and exponential) as well as for the mutual
inclination distribution (uniform and Rayleigh; see also Sandford et al. 2019). By modeling the
intrinsic multiplicity as a uniform (or Poisson) distribution and the mutual inclination as a
Rayleigh distribution, Lissauer et al. (2011) were able to find matches to all observed
transit multiplicities except the transit singles. Specifically, their models would under-predict
the number of systems with only one transiting planet by nearly 50%. This signals the
failure of their simplified model. Nevertheless, this feature was picked up by many others
and phrased as the evidence for two distinct populations of planetary systems (the so-called
“Kepler dichotomy”): in one population planetary systems have small mutual inclinations
and relatively compact configurations, whereas in the other population planetary systems
have either only one planet or at least two largely mutually inclined planets (e.g.,
Ballard & Johnson 2016; Mulders et al. 2018; He et al. 2019). Taking the Kepler
sample as a whole, in terms of distributions of many properties of stars (e.g., stellar
mass, metallicity) and planets (e.g., period), transit singles and transit multis are
statistically consistent with being drawn from the same parent population (e.g., Xie et al. 2016;
Munoz Romero & Kempton 2018; Zhu et al. 2018b; Weiss et al. 2018a), suggesting that
they probably have the same origin.

While modeling the mutual inclination as a Rayleigh distribution (or more generally,
Fisher distribution; Tremaine & Dong 2012; Zhu et al. 2018b) seems a reasonable choice (see


8 The orbital eccentricity $e$ in principle also affects the $\xi$ distribution, but its contribution is
relatively minor and thus $e$ cannot be well constrained with this method (Fabrycky et al. 2014).




also Tremaine 2015), the proper functional form for the intrinsic multiplicity distribution
remains an open question. Nevertheless, it is certainly oversimplified to assume that all
planetary systems have the same number of planets (e.g., Ballard & Johnson 2016).
Having a Poisson distribution for the intrinsic multiplicity (e.g., Lissauer et al. 2011)
is likely not justified, either. The
underlying assumption behind the Poisson distribution is that occurrences and properties
of individual planets around the same host are independent of each other. While it has
not been proved invalid for Kepler planets, there is emerging evidence that the presence
and properties of planets inside the same system may be correlated due to the shared
formation environment and/or host properties (see Sections 2.4 and 2.5). Furthermore,
the exponential or power-law (i.e., Zipfian distribution) forms can
be securely ruled out. These distributions predict overly abundant intrinsic single-planet
systems, which is not supported by TTV observations (see below).

Given the strong degeneracies, disentangling the intrinsic multiplicity function and
the mutual inclination distribution requires external information. To this end,
Tremaine & Dong (2012) developed a general statistical framework to account for
observational biases of different techniques. Applying their method to planetary systems found
by Kepler and RV, Tremaine & Dong (2012) found that the mean mutual inclination
dispersion, which was assumed to be the same for all multiplicities, should be $< 5^\circ$ and that


the intrinsic multiplicity function could not be constrained. See fora 
different attempt in combining Kepler and RV data. 

Tremaine & Dong (2012) also pointed out an observational feature that was difficult for
their models to explain. As originally noticed by Ford et al. (2011), the fraction of systems
showing TTV signals does not seem to vary significantly with the transit multiplicity, except
perhaps for very high ($> 4$) multiplicities (see also Xie et al. 2014). A similar feature also
shows up in later large and uniform TTV searches, which consistently found that nearly
half of the TTV detections were from systems with only one transiting planet
(Holczer et al. 2016; Ofir et al. 2018). This indicates that planets in transit singles have almost the
same probability to show TTV signals as planets in transit multis.


2.2.3. Multiplicity-dependent mutual inclinations. The assumption that the mutual incli- 
nation distribution is independent of the intrinsic multiplicity may not be valid. With all 
else being equal, the critical mutual inclination for long-term instability is probably depen- 
dent on the number of planets in the system (e.g., see also Section -4-2). 
Observationally, one also finds that the distribution of the $\xi$ parameter appears statistically
different for different transit multiplicities. As shown in Figure 5a, lower transit
multiplicities have broader $\xi$ distributions, suggesting larger mutual inclinations (see also
He et al. 2020).

Zhu et al. (2018b) introduced the following relation between the mutual inclination
dispersion, $\sigma_i$, and the intrinsic multiplicity (within the Kepler window), $k$,

$$\sigma_i(k) = 0.8^\circ \left(\frac{k}{5}\right)^{\alpha}. \tag{14}$$

They applied the statistical framework of Tremaine & Dong (2012) and combined the transit
and TTV statistics to infer the intrinsic multiplicity and mutual inclination distributions.
TTV, as a detection technique (Agol et al. 2005; Holman & Murray 2005), applies to the
same population of planetary systems as transit, and thus the combination of TTV and
Figure 5


(a) The cumulative distribution functions (CDFs) of the weighted transit duration ratio, $\xi$ (Equation 13), for different
transit multiplicities. Larger transit multiplicities tend to have narrower $\xi$ distributions, suggesting smaller mutual
inclination dispersions $\sigma_i$. (b) The CDFs of the normalized transit duration, $T/T_0$ (Equation 15), for different transit
multiplicities. Here $T_0$ is the transit duration for a circular and coplanar orbit. Larger transit multiplicities have narrower
$T/T_0$ distributions, indicating smaller eccentricity dispersions $\sigma_e$.


transit is free from many assumptions and selection biases (compared to the use of RV; e.g.,
Tremaine & Dong 2012). Zhu et al. (2018b) found that the intrinsic multiplicity and the
mutual inclination dispersion should be strongly correlated, with $-4 < \alpha < -2$ at the $2\sigma$
confidence level (see Figure 6a for an illustration). In other words, systems with fewer
planets are dynamically hotter. This result also points to large mutual inclinations ($\gtrsim 10^\circ$)
for 2-planet and 3-planet systems. A recent work by He et al. (2020) found a qualitatively
similar (although statistically different) result with a best-fit $\alpha = -1.7$ from modeling
a collection of Kepler statistics (including transit multiplicities, the period distribution,
period ratio distribution, etc.) and imposing the angular momentum deficit (AMD) stability
criterion (Laskar 1997; Laskar & Petit 2017) in simulated planetary systems. It is also worth
noting that such a relation is steeper than the similar relation inferred from RV eccentricities
(Limbach & Turner 2015), ergodic models (Tremaine 2015), or the extrapolations of the
empirical stability boundary (e.g., Pu & Wu 2015).
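
As an illustration of how steeply Equation 14 rises toward low multiplicities, the sketch below tabulates $\sigma_i(k)$ for the two ends of the quoted $2\sigma$ range of the power-law index and shows how a mutual inclination could be drawn from the corresponding Rayleigh distribution. The function names are hypothetical, and treating the index as a free argument (rather than fixing a best-fit value) is a choice made for this example.

```python
import math
import random

SIGMA_I5_DEG = 0.8  # dispersion for k = 5 systems, the normalization in Equation 14


def sigma_i_deg(k, alpha):
    """Mutual inclination dispersion (degrees) for intrinsic multiplicity k."""
    return SIGMA_I5_DEG * (k / 5.0) ** alpha


def draw_mutual_inclination_deg(k, alpha, rng):
    """Draw one inclination from a Rayleigh distribution via its inverse CDF."""
    sigma = sigma_i_deg(k, alpha)
    u = rng.random()  # uniform in [0, 1), so 1 - u is never zero
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))


for alpha in (-2.0, -4.0):
    row = ", ".join(f"k={k}: {sigma_i_deg(k, alpha):.1f}" for k in (2, 3, 5))
    print(f"alpha = {alpha:+.0f} -> {row} (degrees)")

rng = random.Random(42)
sample = [draw_mutual_inclination_deg(2, -3.0, rng) for _ in range(5)]
print("example draws for k = 2, alpha = -3:", [round(x, 1) for x in sample])
```

Systems with $k = 5$ always return the normalization of $0.8^\circ$, while two-planet systems reach several degrees or more anywhere within the quoted range of the index.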


Zhu et al. (2018b) also reported constraints on the intrinsic multiplicity vector, which is
reproduced in Figure 6b. Although the individual components of the multiplicity vector
are not well constrained, the summed fraction is well measured to be $30 \pm 3\%$ and does not
rely on many assumptions like the other measurements do (see Section 5.1 of Zhu et al.
2018b; see also Section 1.1). The resulting average multiplicity in the Kepler parameter
space is $\bar{m}_p = 3.0 \pm 0.3$. This serves as a lower bound on the average multiplicity in the
inner ($\lesssim 1$ AU) region, as smaller planets below the detection threshold of Kepler are
unconstrained.
Figure 6


(a) Distributions of eccentricity and mutual inclination dispersions as functions of the intrinsic multiplicity. Here the Kepler
window is roughly the region above the gray curve in Figure 2. The relation $\sigma_e = \sigma_i$ is assumed (see Section 2.3.2). The
$\alpha$ parameter quantifies the strength of the correlation with the intrinsic multiplicity (Equation 14). The orange band
denotes the inferred average multiplicity $\bar{m}_p$ for Kepler systems. The eccentricity dispersion inferred from transit singles
(i.e., $k > 1$) and the mutual inclination dispersion constraint from systems with five Kepler transiting planets are shown as
black squares. (b) The inferred intrinsic multiplicity vector and the associated probability distribution functions. The
fraction of Sun-like stars with more than seven planets in the Kepler window is limited to $< 2.2\%$ (95% upper limit). The
medians and the 16-84% ranges of individual components are denoted with squares and error bars, respectively. Poisson
distributions with different values of the mean parameter $\lambda$ are shown for reference. Both plots are adapted from Zhu et al.
(2018b).


2.3. Eccentricity distribution 


Similar to mutual inclinations, orbital eccentricities also provide important information on
the formation and dynamical evolution of planetary systems. Here we focus on the
eccentricity results from the Kepler sample. Readers can find discussions about eccentricities
from RV in Section 3.1 of Winn & Fabrycky (2015).

The majority of the eccentricity measurements of individual Kepler planets were made
through modeling the TTV and TDV signals (e.g., 
[2017). These studies have found that the eccentricities of 


Kepler planets in near-resonance pairs are typically small, with a Rayleigh dispersion of up 
to a few percent. However, the planets selected for such dynamical modelings are probably 
a biased sample, and thus the derived eccentricity distribution may not be representative 
of the more general population. 
2.3.1. The transit duration method. The transit duration (between the first and the fourth
contact points) is given by⁹

$$\frac{T}{T_0} = \sqrt{(1+r)^2 - b^2}\,\frac{\sqrt{1-e^2}}{1+e\sin\omega}. \tag{15}$$

Parameters $r$, $b$, and $T$ are the same as those in Equation 13, and $\omega$ is the argument of
periapsis. The quantity $T_0$ measures the transit duration between the first (second) and
the third (fourth) contact points of a planet with the same period but circular ($e = 0$) and
edge-on ($b = 0$) orbit and is related to the mean density of the host star, $\rho_\star$, via

$$T_0 = \frac{R_\star P}{\pi a} \approx 13\,{\rm hr} \left(\frac{P}{\rm yr}\right)^{1/3} \left(\frac{\rho_\star}{\rho_\odot}\right)^{-1/3}. \tag{16}$$

With known parameters from transit modeling ($b$, $r$, $P$, and $T$) and the nuisance parameter
$\omega$ assumed to follow a uniform distribution, the quantity $T/T_0$ can be used to constrain the
statistical distribution of $e$, provided that the stellar mean density is precisely measured
(Ford et al. 2008). With other parameters being the same, larger eccentricities lead to
broader distributions of the $T/T_0$ ratio (see Figure 5b). The successful application of this
method heavily depends on the accurate characterization of the host stars. As a result,
early attempts to study the Kepler sample were all limited by the systematic uncertainties
in the stellar properties (e.g., Moorhead et al. 2011; Kane et al. 2012; Plavchan et al. 2014).
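
The statement that larger eccentricities broaden the $T/T_0$ distribution can be illustrated with a small Monte Carlo sketch. It assumes Rayleigh-distributed eccentricities, a uniform argument of periapsis, and impact parameters uniform in [0, 0.9], and it ignores the dependence of the transit probability on $e$ and $\omega$; these choices, the function names, and the fixed radius ratio are simplifying assumptions of this example rather than the procedure of the studies cited above.

```python
import math
import random
import statistics


def t0_hours(period_yr, rho_star_solar=1.0):
    """Central-transit duration of a circular orbit (hours), following the scaling T0 ~ 13 hr."""
    return 13.0 * period_yr ** (1.0 / 3.0) * rho_star_solar ** (-1.0 / 3.0)


def duration_ratio(r, b, e, omega):
    """T / T0 for radius ratio r, impact parameter b, eccentricity e, periapsis angle omega."""
    chord = math.sqrt((1.0 + r) ** 2 - b ** 2)
    return chord * math.sqrt(1.0 - e ** 2) / (1.0 + e * math.sin(omega))


def ratio_spread(sigma_e, n=5000, r=0.02, seed=0):
    """Standard deviation of simulated T / T0 values for a Rayleigh eccentricity dispersion."""
    rng = random.Random(seed)
    ratios = []
    while len(ratios) < n:
        e = sigma_e * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
        if e >= 0.95:  # discard unphysically large draws from the Rayleigh tail
            continue
        omega = rng.uniform(0.0, 2.0 * math.pi)
        b = rng.uniform(0.0, 0.9)
        ratios.append(duration_ratio(r, b, e, omega))
    return statistics.pstdev(ratios)


spread_low = ratio_spread(0.05)
spread_high = ratio_spread(0.3)
print(f"T0 at P = 10 d, solar density: {t0_hours(10.0 / 365.25):.2f} hr")
print(f"spread of T/T0: sigma_e = 0.05 -> {spread_low:.3f}, sigma_e = 0.3 -> {spread_high:.3f}")
```

A transit near periapsis ($\omega = \pi/2$) is shorter than the circular case and one near apoapsis is longer, which is why a wider eccentricity distribution spreads $T/T_0$ in both directions.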

2.3.2. Multiplicity-dependent eccentricity distribution. Van Eylen & Albrecht (2015)
applied a variant of the transit duration method to a carefully selected sample of Kepler
multi-planet systems whose host stars were precisely characterized via asteroseismology.
These authors found that the eccentricities of planets in their sample could be well
described by a Rayleigh distribution with $\sigma_e \approx 0.05$. Using accurate spectroscopic stellar
parameters from LAMOST, Xie et al. (2016) found similar nearly circular orbits for planets
in the Kepler multis, and they reported a much larger eccentricity dispersion ($\sigma_e \approx 0.3$) for
Kepler planets in systems with single transiting planets. Both results have been confirmed
by later works (Van Eylen et al. 2019; Mills et al. 2019).

The multiplicity-dependent eccentricity distribution goes beyond the single vs. multiple
bifurcation. This is demonstrated in Figure 5b, where we show the cumulative
distributions of the $T/T_0$ ratios derived from our planet sample for different transit multiplicities.
Here we have used the stellar mean densities from isochrone fits by Berger et al. (2020b)
and the values of $T$ from the Kepler DR25 MCMC chains. As
Figure 5b indicates, the distribution of the $T/T_0$ ratio becomes narrower with increasing
transit multiplicity. As the transit multiplicity can be viewed as a rough proxy of the intrinsic
planet multiplicity, this suggests that planetary systems with more planets have smaller
eccentricity dispersions. This is also qualitatively consistent with studies of the RV planets
(Limbach & Turner 2015; Zinzi & Turrini 2017). Based on observations of the solar system and
the general expectation that the dispersions of orbital eccentricity and mutual inclination


9 Note that our definition of the transit duration follows that of Seager & Mallén-Ornelas (2003)
and is different from that of Winn & Fabrycky (2015). The latter measures the duration between
two points where the planetary center sits on the edge of the projected stellar surface (see Figure 2
we may use the same relation between intrinsic multiplicity and mutual inclination
dispersion (Equation 14) for the relation between intrinsic multiplicity and orbital eccentricity
dispersion. The multiplicity-dependent eccentricity dispersion is also shown in Figure 6a
(see also He et al. 2020).

The large eccentricities and mutual inclinations of Kepler low-multiples have important
theoretical implications. The largest eccentricity that can be achieved via scatterings among
small Kepler planets themselves can be roughly estimated as:

$$e_{\max} \sim \frac{v_{\rm esc}}{v_{\rm orb}} = \left(\frac{2 m_p}{M_\star}\frac{a}{R_p}\right)^{1/2} \propto \left(\frac{m_p}{M_\star}\right)^{1/2} \left(\frac{a}{0.1\,{\rm AU}}\right)^{1/2} \left(\frac{R_p}{2\,R_\oplus}\right)^{-1/2}. \tag{17}$$

Here $v_{\rm esc}$ and $v_{\rm orb}$ are the surface escape velocity and orbital velocity of the planet,
respectively. The evaluation takes the typical values of a Kepler planet. While the above scaling
relation bears some significant uncertainties, the large eccentricities ($\sigma_e \approx 0.3$) and mutual
inclinations ($\sigma_i \gtrsim 10^\circ$) observed in the low-multiplicity planetary systems are probably on
the high end of the distribution. This suggests that these planetary systems may have
undergone significant dynamical interactions among the inner planets themselves. Alternatively,
other mechanisms may need to be invoked to excite eccentricities and mutual inclinations to
values larger than what the self-scatterings can achieve. One promising mechanism is the
interaction between the inner system and the outer massive planets (e.g.,|Johansen et al. 


and references therein). We return to this point 


in Section 3.2.
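
For orientation, the scattering limit $e_{\max} \sim v_{\rm esc}/v_{\rm orb}$ discussed above can be evaluated numerically. The sketch below uses $a = 0.1$ AU and $R_p = 2\,R_\oplus$ around a solar-mass star; the planet mass of $5\,M_\oplus$ is an assumed illustrative value (it is not given in the text), and the constants and function name are likewise choices made for this example.

```python
import math

# Nominal constants in SI units (assumed for this sketch).
M_SUN = 1.989e30     # kg
M_EARTH = 5.972e24   # kg
R_EARTH = 6.371e6    # m
AU = 1.496e11        # m


def e_max(m_p_earth, r_p_earth, a_au, m_star_solar=1.0):
    """Ratio of surface escape velocity to orbital velocity, sqrt(2 m_p a / (M_star R_p))."""
    mass_ratio = (m_p_earth * M_EARTH) / (m_star_solar * M_SUN)
    size_ratio = (a_au * AU) / (r_p_earth * R_EARTH)
    return math.sqrt(2.0 * mass_ratio * size_ratio)


# Assumed fiducial planet: 5 Earth masses, 2 Earth radii, 0.1 AU from a Sun-like star.
e_fiducial = e_max(5.0, 2.0, 0.1)
print(f"e_max for the assumed fiducial planet: {e_fiducial:.2f}")
```

With these assumed values the estimate comes out near 0.2, and it grows as $a^{1/2}$, so self-scattering is more effective farther from the star.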


2.4. Intra-system variation 


The intra-system variation, which concerns the relative properties of planets around the same
host, is useful in constraining the formation and evolution processes of planetary systems.
It also bears on the statistical inference of exoplanets in general: in some statistical studies,
planet detections from the same star are treated as independent events (see, e.g., Section 2.2);
in others, specific assumptions about the relative properties of planets in multi-planet systems
must be made when synthetic systems are generated (e.g., 2018; 2019). The derived statistics are
to some extent subject to the validity of such assumptions.


2.4.1. “Peas in a pod?”. Transiting planets in the same Kepler multi-planet systems
preferentially have similar sizes. This feature has been noticed since the early days of the Kepler
mission (Lissauer et al. 2011; 2013). Follow-up observations that provided improved
characterizations of the host stars enabled further studies that tried to understand the nature of
this feature (Weiss et al. 2018b; Weiss & Petigura 2020; Murchikova & Tremaine 2020). In
particular, Weiss et al. (2018b) quantified the correlation between sizes of neighboring planets
around the same host in their sample. To check the statistical significance of this correlation,
they generated synthetic systems by randomly drawing planetary radii from the observed size
distribution and then performed the same correlation test. The size correlations in their synthetic
systems were much weaker than what they saw in real systems, and thus they concluded the pattern
was astrophysical. Together with a similar result on the spacings between planets, Weiss et al.
(2018b) concluded that planets in Kepler multi-planet systems have similar sizes and regular
spacings, a pattern they termed “peas in a pod” (a similar claim has been made about Kepler planet
masses). A later study by He et al. (2019) reached a similar conclusion. According to these
authors, planetary systems that contain clusters of planets whose sizes and orbital periods are
correlated produce a better match to the observed Kepler systems in terms of the joint statistics
of transit depth distribution, period distribution, period ratio distribution, etc.^10

Different opinions exist about the nature of the observed correlations. Zhu (2020) pointed out
a detection bias that was underestimated in the statistical method of Weiss et al. (2018b).
Because small planets can be detected only around bright and quiet stars, whereas only large
planets are detectable around faint or noisy stars, the same transit detection threshold
(i.e., a fixed S/N) naturally leads to varying planetary size thresholds in different systems.
This, combined with the fact that smaller planets are more abundant, naturally leads to
a size correlation in the observed transit pairs (see also Murchikova & Tremaine 2020).
However, it appears that the apparent correlation in planetary sizes is too strong to be
explained entirely by this detection bias (Zhu 2020).
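
The detection bias described above can be illustrated with a toy Monte Carlo experiment. The sketch below is not the method of any of the cited studies: the power-law size distribution, the log-uniform range of per-star size thresholds, and all function names are assumptions made purely for illustration. Planet radii are drawn independently of one another, yet pairs detected around the same star end up correlated because they share a star-dependent size threshold; shuffling the pairs across stars, as in a null test that draws radii from the pooled observed distribution, removes the correlation.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def draw_radius(rng, r_min, r_max=16.0, alpha=2.0):
    """Draw a radius from dN/dR ~ R^-(alpha+1), truncated to [r_min, r_max).

    Small planets are more abundant; r_min is the star's detection threshold.
    Uses the inverse CDF of a truncated Pareto distribution.
    """
    u = rng.random()
    tail = (r_min / r_max) ** alpha
    return r_min * (1.0 - u * (1.0 - tail)) ** (-1.0 / alpha)


def simulate(n_stars=5000, seed=1):
    rng = random.Random(seed)
    inner, outer = [], []
    for _ in range(n_stars):
        # Assumed per-star size threshold, log-uniform between 0.5 and 4 Earth radii
        # (quiet, bright stars have low thresholds; noisy, faint stars have high ones).
        r_thresh = 0.5 * 8.0 ** rng.random()
        # Two detected planets, drawn independently given the shared threshold.
        inner.append(math.log10(draw_radius(rng, r_thresh)))
        outer.append(math.log10(draw_radius(rng, r_thresh)))
    r_same_star = pearson(inner, outer)
    # Null test: break the star-by-star pairing.
    shuffled = outer[:]
    rng.shuffle(shuffled)
    r_shuffled = pearson(inner, shuffled)
    return r_same_star, r_shuffled


r_same_star, r_shuffled = simulate()
print(f"same-star pairs: r = {r_same_star:.2f}; shuffled pairs: r = {r_shuffled:.2f}")
```

In this toy setup the same-star pairs show a clearly positive correlation while the shuffled pairs show essentially none. The experiment only demonstrates that such a bias exists; it says nothing about whether the bias is large enough to account for the correlation observed in the real Kepler sample.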

Another factor that has not been fully explored is the contribution of the planets that are 
missing, due to large impact parameters or sub-threshold values of transit S/N, in known 
Kepler multi-planet systems. Our solar system is an excellent example to demonstrate 
this point. The four outer giant planets would be unlikely to be detected by a transit 
mission similar to Kepler because of their long orbital periods. Of the four terrestrial 
planets, Mercury and Mars are almost impossible to detect in transit due to their small 
sizes. Therefore, a Kepler-like mission would, if it detected anything at all, most likely detect
the Venus-Earth planet pair, which shows very similar sizes (0.95 $R_\oplus$ vs. 1 $R_\oplus$) and
masses (0.82 $M_\oplus$ vs. 1 $M_\oplus$). However, this level of similarity is not representative of
the solar system planet pairs.

The physical interpretation of the size correlation (if any) is also unclear. One interpretation
is that planets “know” about their siblings, namely that the formations of two neighboring
planets are directly correlated (e.g., 2020). Another interpretation is that planets “know”
about the system and the environment they formed in, namely that the formations of planets in
the same system are all related to some global properties (Murchikova & Tremaine 2020).
In this latter case, the apparent correlation between planetary sizes is only a projection of
the correlation between the individual planets and the host star (or the birth disk). This
latter interpretation has some observational support. For example, the planet distribution
is shown to depend on the orbital period (see Section 2.1) and stellar properties (see
Section 2.5). It has also been demonstrated with a toy model that the observed
size correlation could be well reproduced if the planets “know” about the host star but do
not “know” about their neighbor planets.


2.4.2. Orbital spacings. The relative positions of planets in Kepler multi-planet systems
have also drawn considerable interest. The majority of the early studies focused on the period
ratio distribution. As shown in Figure 7, Kepler systems contain very few planet pairs near or in
low-order mean-motion resonances (see also Lissauer et al. 2011; Fabrycky et al. 2014). This is
in contrast with earlier RV results that a substantial fraction of


10 The clustered model of He et al. (2019) has more free parameters than their non-clustered
model. However, the authors did not perform model comparisons to justify the introduction of
the additional flexibility. See Zhu (2020) for more discussions.




[Figure 7 plot. Main panel: spacing in mutual Hill radii, K, versus period ratio, P_out/P_in - 1;
upper and right panels: cumulative distributions (CDF). Legend: 2-tranet, 3-tranet, 4-tranet,
≥ 5-tranet, all pairs.]

Figure 7


Spacings between the apparently adjacent planets in Kepler multi-planet systems, with different
colors indicating planet pairs from different transit multiplicities. The x-axis of the main panel
(lower left) shows the spacing in terms of the orbital period ratio, and the upper panel shows the
corresponding cumulative distributions. A few example period commensurabilities are indicated in
both panels. The y-axis of the main panel shows the spacing in terms of the mutual Hill radii
(Equation 18), and the right panel shows the corresponding cumulative distributions. The
stability thresholds for 3-tranet, 4-tranet, and 5-tranet systems, derived according to
Equation 20, are indicated with solid horizontal lines. Values corresponding to twice the
thresholds are also shown as dashed horizontal lines. The code Forecaster from Chen & Kipping
(2017) is used to predict the planet mass based on the planetary radius, and we have revised the
upper mass limit to $10^3\,M_\oplus$ (~ 3 $M_{\rm J}$) to avoid masses beyond the planetary regime.
Solar system planet pairs are also indicated for reference.


well-characterized multi-planet systems contain pairs of giant planets close to mean-motion
resonances (e.g., Wright et al. 2011). We refer to Section 4.1 for the theoretical implications
of this feature. Additionally, the asymmetry around exact period commensurabilities has
also attracted much attention (Fabrycky et al. 2014), and we refer interested readers to the
fairly comprehensive overview by Terquem & Papaloizou for this particular issue (see
also the recent development by Millholland & Laughlin). This review focuses on the
dynamical compactness of the Kepler multi-planet systems, which concerns the long-term
stability and thus the dynamical evolution.

When the stability of the planetary system is concerned, the orbital spacing between
planets is usually expressed in the dimensionless parameter K

$$K \equiv \frac{a_{\rm out} - a_{\rm in}}{R_{\rm H}}, \qquad R_{\rm H} \equiv \frac{a_{\rm in} + a_{\rm out}}{2} \left(\frac{m_{\rm in} + m_{\rm out}}{3 M_\star}\right)^{1/3}. \tag{18}$$

Here $R_{\rm H}$ is called the mutual Hill radius, $M_\star$ is the mass of the host star, and
$a_{\rm in}$ ($a_{\rm out}$) and $m_{\rm in}$ ($m_{\rm out}$) are the semi-major axis and the mass of the
inner (outer) planet, respectively.
For two-planet systems, the condition for long-term stability (and thus instability) is
well understood theoretically, and the instability arises when mean-motion resonances
overlap (e.g., 2018). For systems with more than two planets, we lack a good theoretical
understanding of the origin of the dynamical instability (see attempts by Chambers et al. 1996;
Zhou et al. 2007; Yalinewich & Petrovich 2020). Nevertheless, numerical studies have shown
that the timescale before a close encounter occurs between planets, t, scales exponentially with
the initial spacing K (Chambers et al. 1996). Details of this scaling relation depend on
factors such as the number of planets, planet masses, orbital eccentricities and inclinations,
as well as the inhomogeneity among planets (e.g., Zhou et al. 2007; Funk et al. 2010; see
Pu & Wu 2015 for a recent summary).
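
The spacing parameter K defined above is straightforward to evaluate for any planet pair. The following minimal sketch computes the mutual Hill radius and K; the function names are illustrative, and the Venus-Earth values used in the example (semi-major axes of 0.723 and 1.000 AU, masses of 0.815 and 1 Earth masses) are standard solar system numbers inserted here as a worked example rather than values taken from the text.

```python
M_EARTH_IN_MSUN = 3.003e-6  # Earth mass expressed in solar masses


def mutual_hill_radius(a_in, a_out, m_in, m_out, m_star):
    """Mutual Hill radius, in the same length unit as a_in and a_out.

    Masses may be in any unit as long as m_in, m_out and m_star share it.
    """
    return 0.5 * (a_in + a_out) * ((m_in + m_out) / (3.0 * m_star)) ** (1.0 / 3.0)


def spacing_k(a_in, a_out, m_in, m_out, m_star):
    """Dimensionless orbital spacing K = (a_out - a_in) / R_H."""
    return (a_out - a_in) / mutual_hill_radius(a_in, a_out, m_in, m_out, m_star)


# Worked example: the Venus-Earth pair around the Sun
k_venus_earth = spacing_k(
    a_in=0.723,
    a_out=1.000,
    m_in=0.815 * M_EARTH_IN_MSUN,
    m_out=1.0 * M_EARTH_IN_MSUN,
    m_star=1.0,
)
print(f"K(Venus-Earth) = {k_venus_earth:.1f}")
```

Because the Hill radius depends only on the cube root of the mass ratio, K is fairly insensitive to errors in the planet masses, which is why masses estimated from radii can still give useful spacings.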


In the context of Kepler systems, Pu & Wu (2015) found through numerical
simulations that the median spacing for stability could be approximated as


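$$\langle K \rangle = 2.87 + 0.7 \log_{10} \tau + 2.4 \left[\left(\frac{\sigma_e}{\epsilon_{\rm H}}\right) + \left(\frac{\sigma_i}{4\,\epsilon_{\rm H}}\right)\right], \tag{19}$$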


where $\tau$ is the physical timescale t scaled by the orbital period of the innermost planet,
$\epsilon_{\rm H}$ is the mutual Hill radius scaled by the semi-major axis of the innermost planet, and
$\sigma_e$ and $\sigma_i$ are the dispersions of orbital eccentricities and mutual inclinations among
the planets, respectively. With the multiplicity-dependent $\sigma_e$ and $\sigma_i$ and the typical
values for Kepler systems ($t \gtrsim$ Gyr old and the innermost planet of planet-to-star mass
ratio $q \approx 10^{-5}$ at 0.1 AU), Equation 19 yields


$$\langle K \rangle \approx 10.2 + 2.2 \left(\frac{k}{5}\right)^{\zeta}. \tag{20}$$


With $\zeta = -2$ (Zhu et al. 2018b; He et al. 2020), planetary systems with (3, 4, 5) Kepler
planets should have critical spacings $\langle K \rangle = (16, 14, 12)$, respectively.
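
The numbers quoted above can be reproduced in a few lines. The sketch below evaluates only the parts of the stability relation that are stated explicitly in the text: the circular, coplanar term 2.87 + 0.7 log10(tau) for a 1 Gyr old system whose innermost planet sits at 0.1 AU around a solar-mass star (the 1 Gyr age and solar mass are the assumed inputs), and the multiplicity-dependent threshold of Equation 20 with an exponent of -2. Function and parameter names are illustrative.

```python
import math


def scaled_age(age_yr, a_inner_au, m_star_msun=1.0):
    """System age in units of the innermost planet's orbital period (Kepler's third law)."""
    period_yr = math.sqrt(a_inner_au ** 3 / m_star_msun)
    return age_yr / period_yr


def circular_coplanar_spacing(tau):
    """Eccentricity- and inclination-independent part of the stability spacing."""
    return 2.87 + 0.7 * math.log10(tau)


def critical_spacing(k, zeta=-2.0, base=10.2, amplitude=2.2):
    """Multiplicity-dependent critical spacing of Equation 20 for k planets."""
    return base + amplitude * (k / 5.0) ** zeta


tau = scaled_age(age_yr=1e9, a_inner_au=0.1)
base_term = circular_coplanar_spacing(tau)
thresholds = {k: critical_spacing(k) for k in (3, 4, 5)}

print(f"circular, coplanar term: {base_term:.1f}")
for k, value in thresholds.items():
    print(f"k = {k}: <K> = {value:.1f}")
```

The first term evaluates to about 10.2, matching the constant in Equation 20, and the three thresholds round to 16, 14, and 12. Lower-multiplicity systems need wider spacings because their eccentricity and inclination dispersions are larger.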

We apply the above stability thresholds to the multi-planet systems from Section 2.1
and discuss the limitations. After the use of Kepler's third law, the only unknown needed to
determine the spacing parameter K is the planet-to-star mass ratio. We estimate the planetary
masses from the measured radii with the Forecaster code from Chen & Kipping (2017) and
adopt the Gaia stellar mass from Berger et al. (2020b). Systems without reported stellar
mass measurements are excluded. Figure 7 illustrates the spacings between neighboring
Kepler planets of all systems and systems divided into different transit multiplicities. For
transit multiplicities of 3, 4, and 5+, the majority (~ 70%) of planet pairs have spacings
above the corresponding stability thresholds, confirming that they are indeed (most likely)
long-term stable. The remaining ~ 30% of planet pairs, considered long-term unstable by the
above empirical thresholds, are probably stable as well. While part of this misclassification
is due to the choice of fixed Kepler system parameters and the empirical (but sometimes
unphysical) mass-radius relation (Chen & Kipping 2017), it nevertheless is a sign of the
failure of the empirically determined stability criteria. In particular, these stability criteria
do not take into account the impact of mean-motion resonances, which can be either
protective or destructive to the involved planets.

Nevertheless, by applying the empirical stability thresholds to the data one finds that the
majority of Kepler planet pairs are not far from the empirical stability limits: the median
spacing of all planet pairs is $K \approx 20$, and about 80-90% of planet pairs from systems with at
least three transiting planets have spacings within twice the empirical stability thresholds
(horizontal dashed lines in Figure 7). These results are consistent with previous findings
(e.g., Weiss et al. 2018b) and also suggest that for the majority of Kepler planet pairs there is
no room for inserting another (undetected) planet in between (Fang & Margot 2013). In other
words, the observed Kepler planets are dynamically packed. However, this does not necessarily
mean that Kepler systems do not contain additional planets. The space interior to the innermost
and particularly exterior to the outermost Kepler planet allows the existence of additional
planets without risking instability. For example, seven planets with $q = 10^{-5}$ are allowed per
factor of 10 in semi-major axis if mutually separated by K = 20. The observed dynamically packed
structure is also probably due in part to the selection bias that it is increasingly difficult for
both planets in a wider-spacing pair to transit the host star.
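
The statement that seven planets fit per decade in semi-major axis follows directly from the definition of K. The sketch below works this out under the assumption, made here for illustration, that all planets have the same planet-to-star mass ratio q, so that the mutual Hill factor uses 2q; the function names are hypothetical.

```python
import math


def semi_major_axis_ratio(k, q):
    """Ratio a_out / a_in of two equal-mass planets (mass ratio q each) separated by K.

    Solves K = (a_out - a_in) / R_H with R_H = (a_in + a_out) / 2 * (2q / 3)^(1/3).
    """
    eps = (2.0 * q / 3.0) ** (1.0 / 3.0)  # mutual Hill radius over mean semi-major axis
    x = 0.5 * k * eps
    return (1.0 + x) / (1.0 - x)


def planets_per_decade(k, q):
    """Number of planets that fit within a factor of 10 in semi-major axis."""
    ratio = semi_major_axis_ratio(k, q)
    n_gaps = math.floor(math.log(10.0) / math.log(ratio))
    return n_gaps + 1  # n gaps are bounded by n + 1 planets


ratio = semi_major_axis_ratio(20.0, 1e-5)
n_planets = planets_per_decade(20.0, 1e-5)
print(f"a_out / a_in = {ratio:.2f}, planets per decade = {n_planets}")
```

For K = 20 and q = 10^-5 adjacent planets are separated by a factor of about 1.46 in semi-major axis, so six such gaps, and hence seven planets, fit within one decade.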

As part of the “peas in a pod” claim (see Section 2.4.1), the spacings between Kepler
planets in the same multi-planet system are found to be statistically similar (Weiss et al.
2018b). However, the observed correlation in spacings is driven by a small fraction (< 5%)
of systems containing the highest multiplicities, and the majority of systems do not show
such a regular spacing pattern (Jiang et al. 2020).


2.5. Dependence of planet statistics on stellar properties 


2.5.1. Impact of stellar companions. Stellar companions to the planet hosts affect the Kepler
planet statistics in several ways. In transit surveys like Kepler, many of them appear
unresolved and dilute the transit signals, potentially leading to misclassifications and
erroneous planetary parameters (Ciardi et al. 2015; 2018).^11 Thankfully, follow-up
high-resolution imaging observations have been performed for nearly all Kepler planet
candidates. For bright targets that contain Jupiter-like transits, systematic radial-velocity
follow-up observations have also been performed and identified a significant false positive
rate (55%) for Jovian planet candidates. These efforts have led to a much better understanding of
the impact of transit dilution on the Kepler planet statistics. In particular, Furlan et al.
(2017) reported that about 10% (30%) of the candidate host stars have observed companions
within 1″ (4″), the majority of which are fairly faint compared to the target stars.


11 Here an ambient star that is not physically associated with the target is also considered a
companion to the target star.

In the most likely scenario that the transit signals come from the primary stars (see, e.g.,
2018), the dilution effect overall only affects the planetary radii by up to a few
percent on average (Furlan et al. 2017). This is within the uncertainty of Gaia-derived radii,
and thus one does not expect it to have a significant impact on the general planet statistics.
However, Earth-sized planets ($R_p < 2\,R_\oplus$) are much more susceptible to the dilution effect,
and thus the relevant statistics may suffer a more dramatic impact (Bouma et al. 2018).

Besides the transit dilution effect, stellar companions can also affect the presence of
planets through many dynamical processes (e.g., Artymowicz & Lubow 1994; Holman & Wiegert
1999). Very close (a < 0.5 AU) stellar binaries can host circumbinary (i.e., planetary-type
or P-type) planets, and over a dozen such systems have been found (see Section 6.2 of Winn
& Fabrycky 2015). We limit our discussions to the observational aspects of circumstellar
(i.e., satellite-type or S-type) planets and the implications for planet statistics.

Studies based on RV and high-resolution imaging observations suggest that the existence
of a close stellar-mass companion is usually associated with a lower frequency of
circumstellar planets (e.g., Wang et al. 2014; 2015; Ngo et al. 2016; Kraus et al. 2016; Moe &
Kratter 2019). The effect of a binary companion on the presence of close-in planets is
quantified by a suppression factor $S_{\rm bin}$, which is the ratio between the fraction of planet
hosts with stellar companions and the fraction of field stars with the same type of companions
(Kraus et al. 2016; 2019). Circumstellar planets are almost completely suppressed
($S_{\rm bin} \lesssim 15\%$) when the stellar companions are close (with separation
$a_{\rm bin} \lesssim 10$ AU), regardless of the planetary size or observed multiplicities. Planets
are nearly unaffected ($S_{\rm bin} \gtrsim 85\%$) if the stellar companions are distant
($a_{\rm bin} \gtrsim 100$ AU). At intermediate separations (~ 10-100 AU), the suppression effect
gradually decreases with increasing separation. See Figure 3 of Moe & Kratter (2019) for a
compilation of observational studies and an illustration of the suppression factor $S_{\rm bin}$
as a function of the binary separation.

With the above suppression effect and the known binary separation distribution, one
can then infer the planet formation efficiency from the measured planetary system frequency
$F_p$. Moe & Kratter (2019) estimated that $F_{\rm bin} \sim 43\%$ of Sun-like primaries in a
magnitude-limited survey like Kepler could not host close-in ($\lesssim 1$ AU) planets simply
because of the influence of binary companions (see also Kraus et al. 2016). If these targets are
excluded from the Kepler statistics, one finds that the formation efficiency of close-in planets
around single stars, a parameter directly related to formation theories, should be
$1/(1 - F_{\rm bin}) = 1.8$ times higher than the fraction of stars with planets $F_p$. This
additional factor also provides a plausible explanation for the discrepancy in hot Jupiter
frequencies measured from RV and Kepler (see Section 2.1.1).
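
The binary correction above is simple arithmetic, sketched here for concreteness. The function name is illustrative, and the 30% planetary system frequency used in the example is a placeholder value chosen only to show the usage, not a number from the text.

```python
def single_star_formation_efficiency(f_p, f_bin=0.43):
    """Correct a measured planetary system frequency for binary suppression.

    f_p   : fraction of all surveyed stars hosting close-in planets
    f_bin : fraction of surveyed primaries unable to host close-in planets
            because of a close stellar companion
    """
    return f_p / (1.0 - f_bin)


boost = single_star_formation_efficiency(1.0)   # the correction factor itself
example = single_star_formation_efficiency(0.30)  # hypothetical measured F_p = 30%
print(f"correction factor = {boost:.2f}, corrected frequency = {example:.2f}")
```

The correction factor evaluates to about 1.75, which the text rounds to 1.8.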


2.5.2. Metallicity effect. Under the general assumption that the bulk metallicity of the host
star is correlated with the total mass of building blocks available for planet formation, it is
reasonable to believe that the planetary occurrence rate and properties may be correlated
with the host star metallicity. For giant planets ($R_p \gtrsim 8\,R_\oplus$ or
$m_p \gtrsim 0.3\,M_{\rm J}$) found by RV, it has been well established that their presence
correlates strongly with the host metallicity (e.g., Santos et al. 2001; Fischer & Valenti 2005).
This giant planet-metallicity correlation lends support to the core accretion model as the
leading theory for the formation of giant planets (e.g., Pollack et al. 1996; Ida & Lin 2004b).
Some recent studies have also claimed that hosts of eccentric giant planets are more metal-rich
than hosts of nearly circular giant planets (Dawson & Murray-Clay 2013; Buchhave et al. 2018),
but stronger statistical




[Figure 8 plot. Orbital period (days) versus host metallicity [Fe/H]. Legend: 1-tranet, 2-tranet,
3-tranet, ≥ 4-tranet; marker sizes: 1-4 R_⊕, 4-8 R_⊕, 8-20 R_⊕.]

Figure 8


An illustration of the Kepler planetary systems in our baseline sample that have spectroscopic
metallicity measurements. We use different colors to separate different observed multiplicities
(out to 400 days) and different label sizes to separate planets of different sizes (small as
1-4 $R_\oplus$, intermediate as 4-8 $R_\oplus$, and giant as 8-20 $R_\oplus$). Cold (P > 400 days)
planets found by RV (as tabulated in Zhu & Wu 2018) and long-period transit searches are also
indicated with thick circles. As the host metallicity increases, the system becomes more likely to
contain giant planets at all periods and small planets at relatively close-in (P < 10 days) orbits
(e.g., Mulders et al. 2016; Dong et al. 2018; Petigura et al. 2018). At very high metallicities
([Fe/H] $\gtrsim$ 0.2), there seems to be a deficit of compact systems (with $\geq 4$ transiting
planets) and planets at intermediate orbits (~10-400 days). These may be related to the emerging
cold giants (Zhu & Wu 2018). The median metallicity of Kepler field stars is [Fe/H] ~ 0.0
(Dong et al. 2014b).

evidence is needed to fully establish this result.

Small planets, in particular those with radii $R_p < 4\,R_\oplus$, show weaker dependences on
host metallicity (e.g., Buchhave et al. 2012). While many studies have focused on the
dependence of the planet frequency $\bar{n}_p$ on host metallicity (e.g., Petigura et al. 2018)
and theoretical implications (e.g., Lee 2019), one may argue that the planetary system frequency
$F_p$ is probably a more suitable parameter to characterize the efficiency of planet formation
under such system-wide parameters like metallicity (Zhu et al. 2016; 2019). If the general
planet-metallicity relation

$$F_p \propto 10^{\gamma\,{\rm [Fe/H]}} \tag{21}$$

is applied, the result of Zhu (2019) suggests $\gamma = 0.5$ for all Kepler-type planets, which
is much weaker than the giant planet-metallicity correlation ($\gamma \approx 2$; Fischer & Valenti
2005). The dependence is further reduced if the close binaries that show anti-correlation
with stellar metallicity are excluded from the statistics (Moe et al. 2019; Kutra & Wu 2020).
Unlike the planetary system frequency $F_p$, the planet frequency $\bar{n}_p$ does not appear to have
a monotonic relation with the host metallicity. In particular, it may start declining when
the metallicity is high enough (Zhu 2019). It has been suggested that this behavior may be
related to the formation of giant planets inside the same system: when the metallicity is high
enough, the system has a significant probability of forming giant planets, and these giants may
reduce the multiplicity of the inner system either because they prohibit the formation of
more small planets or because they dynamically remove some of the small planets from
the inner system. This scenario may also explain the increased diversity of planets around
metal-rich Kepler hosts (Petigura et al. 2018) and the over-abundant compact planetary
systems around metal-poor stars (Zhu & Wu 2018; Brewer et al. 2018). Figure 8 displays
along the host metallicity [Fe/H] the Kepler systems with metallicity measurements in our
baseline sample.
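
To give a sense of how different the two metallicity dependences are, the power-law relation of Equation 21 can be evaluated for the two exponents quoted above. This is a minimal sketch: the function name is illustrative, and the comparison metallicity of [Fe/H] = +0.2 is an arbitrary example value.

```python
def relative_frequency(feh, gamma):
    """Planetary system frequency relative to solar metallicity, F_p ~ 10^(gamma [Fe/H])."""
    return 10.0 ** (gamma * feh)


feh_example = 0.2  # assumed example metallicity (dex)
kepler_like = relative_frequency(feh_example, gamma=0.5)  # all Kepler-type planets
giants = relative_frequency(feh_example, gamma=2.0)       # giant planets

print(f"Kepler-type planets: x{kepler_like:.2f}; giant planets: x{giants:.2f}")
```

At this metallicity the frequency of Kepler-type planetary systems rises by only about a quarter relative to solar-metallicity hosts, whereas the giant planet frequency rises by a factor of about 2.5.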

While the stellar bulk metallicity measured in iron abundance [Fe/H] (or a mix of metals
[m/H]) is usually used in studies of the planet metallicity dependence, other elemental
abundances, in particular α elements and refractory elements, have also been examined for
possible correlations with planet properties (e.g., Adibekyan et al. 2012; 2016; 2019). No
clear trends have been found so far, probably due to the limited sample size, the measurement
precision, and/or the impact of Galactic chemical evolution.


2.5.3. Dependence on stellar mass. A number of studies have also investigated the dependence
of planet frequency on host mass. A theoretical possibility is that the stellar mass
correlates with the total mass in the protoplanetary disk and thus the amount of solid
materials available for planet formation. This is largely consistent with direct observations of
protoplanetary disks at (sub-)millimeter wavelengths (Andrews et al. 2013; Ansdell et al.
2016), although at a fixed stellar mass the scatter of inferred disk masses remains substantial
(up to an order of magnitude; Ansdell et al. 2016).

We would like to start by pointing out several potential issues. First, similar to the
metallicity dependence (see Section 2.5.2), the two frequencies, $F_p$ and $\bar{n}_p$, can behave
differently, especially for the small planets with high multiplicity rates. Second, as more
massive stars also tend to be more metal-rich, one may need to carefully disentangle possible
correlation between stellar mass and metallicity in the sample (e.g., Johnson et al. 2010;
2020). Furthermore, the choice of the parameter to study the correlation may matter. While
planetary radius (or mass) and orbital period (or semi-major axis) are commonly used in
statistical studies, Nature may prefer other physical units such as the planet-to-star mass
ratio or the position of the water snow line (see Figure 10 for an illustration). Last but not
least, as far as the planet formation efficiency is concerned, one must correct for the
suppression effect due to close stellar binaries (see Section 2.5.1). It is established that the
close binary fraction correlates with the primary mass (e.g., Duchêne & Kraus 2013), so the
suppression effect is expected to affect the statistics of planets around different stellar
masses differently (Moe & Kratter 2019).

The dependence of giant planets on stellar mass has been investigated in many studies
with different detection methods (e.g., Johnson et al. 2007; 2010; Howard et al. 2012; Fressin
et al. 2013; Nielsen et al. 2019). To avoid many of the issues listed above, here we focus on
the results from long-term RV surveys, as they cover a broad range of parameter space and
are nearly free of close stellar binaries. In particular, Johnson et al. (2010) analyzed a sample
of 1266 stars with at least 3-year RV observations and masses spanning from 0.2 $M_\odot$ up to
their estimated 1.9 $M_\odot$, and reported a linear relation between the occurrence rate^12 and


12 Their derived occurrence rate is technically $\bar{n}_p$, but because of the low multiplicity rate
of giant planets it closely approximates the rate $F_p$.

stellar mass. This result has been widely considered as a benchmark in both theoretical and
observational studies of giant planets. The higher-mass part of the sample comes from the
so-called “retired A-stars,” and their spectroscopic mass estimates are controversial.
Recently, an asteroseismic study (2020) showed that the spectroscopic masses of the “retired
A-stars” above 1.6 $M_\odot$ are overestimated, confirming earlier reports (2011; 2013). Since
such stars constitute the heavier half of the Johnson et al. (2010) sample and contribute most of
the statistical evidence for the reported stellar mass dependence (see Figure 4 of Johnson et al.
2010), a revisit of the mass correlation will be needed. Additionally, the result of Johnson et al.
(2010) is limited to the region with separation a < 2.5 AU. As shown by Clanton & Gaudi (2014,
2016), after those at (slightly) larger separations are taken into account, giant planets are
almost as common around M-dwarfs as they are around Sun-like stars (see Section 3.3).
For small planets, the Kepler survey provides the best sample to study their stellar mass
dependence. Studies have shown that the planet frequency $\bar{n}_p$ in the Kepler parameter space
is anti-correlated with stellar mass (e.g., 2012; Mulders et al. 2015a,b). Using
the Berger et al. (2018; 2020b) sample with stellar effective temperature between 4000-5000 K,
we find a frequency of $\bar{n}_p = 3.3 \pm 0.4$ for planets in the radius range 1-20 $R_\oplus$ and
period < 400 days, which is a factor of ~ 2.7 higher than the rate for our baseline Sun-like
sample (Section 2.1). Later M-type stars have even more planets (2015). The planetary system
frequency $F_p$ is also anti-correlated with stellar mass but likely at a weaker level, due to the
increased average planet multiplicity around later-type stars (e.g., Yang et al. 2020). There is
some sign of an increased observed multiplicity rate in our 4000-5000 K sample (48.2%) compared
to that (42.5%) of our baseline Sun-like star sample (Section 2.1). After the correction for the
suppression effect due to close stellar companions, the difference in formation efficiencies of
small planets between single Sun-like and later-type hosts is likely further reduced, although an
anti-correlation probably remains (Moe & Kratter 2019).


3. THE OUTER PLANET POPULATION 


In the earliest stage of planet formation, the region beyond ~ 1 AU is expected to contain
most of the mass and the angular momentum of the protoplanetary disk. Therefore, the
frequency and properties of planets in this outer region (~ 1-10 AU) have important
implications for the formation and evolution of the whole system, including the planets in the
inner ~ 1 AU region. In this section, we review our current understanding of this outer
planet population and discuss its connection with the inner planetary system.

We set the boundary between the inner and outer regions at ~ 1 AU partly because this is
approximately the detection limit of the Kepler mission, but also because it coincides with the
position where giant planets show a rapid rise in frequency. RV surveys have found that giant
planets ($m_p \gtrsim M_{\rm Sat} \approx 0.3\,M_{\rm J}$) appear ~ 5 times more often between
~ 1-3 AU than they do within ~ 1 AU (Cumming et al. 2008).


3.1. Planet Frequency 


RV surveys have found that cold giant planets (0.3-13 $M_{\rm J}$ at ~1-5 AU) appear around
on the order of ~ 10% of Sun-like stars. If the giant planet distribution is modeled as a
parametric function that joins single power-law distributions of mass and orbital period
(Tabachnik & Tremaine 2002), the integrated rate out to P ~ 5.5 years is found to be
$\bar{n}_p = 0.105$ (Cumming et al. 2008). Such a single power-law period distribution tends to
over-predict the number of giant planets at wider ($\gtrsim 10$ AU) separations. To better match
the observed distribution, Fernandes et al. (2019) replaced it with a broken power law and
found a potential peak at ~2-3 AU (see also 2016). Extending their distribution
function out to 100 AU, Fernandes et al. (2019) found $\bar{n}_p \approx 0.27$ and $0.062$ for
planets in the mass range 0.1-20 $M_{\rm J}$ and 1-20 $M_{\rm J}$, respectively. If the frequency of
the so-called Jupiter analogs, namely Jupiter-mass (~0.3-3 $M_{\rm J}$) planets in Jupiter-like
(a few AU) orbits around Sun-like hosts, is concerned, several independent studies have
collectively pointed to a rate of about a few percent (e.g., Wittenmyer et al. 2016 and
references therein), suggesting that planetary systems similar to our own may be relatively
uncommon (see also Section 3.2). Unlike our Jupiter, a significant fraction of cold giant
exoplanets are on substantially eccentric orbits with typical eccentricities e ~ 0.3 (e.g.,
Wright et al. 2009). We refer to the review by Winn & Fabrycky (2015) for more discussions on
these topics.


Although the region beyond ~ 1 AU is nominally out of the reach of Kepler, studies have
nevertheless systematically searched for and statistically studied the long-period transiting
planets in the Kepler data (e.g., Foreman-Mackey et al. 2016; Herman et al. 2019; Kawahara &
Masuda 2019). In particular, an occurrence rate of $\bar{n}_p \approx 0.7$ was reported (2019) for
planets with sizes between 0.3-1 $R_{\rm J}$ and orbital periods between 2-10 years.
The inferred radius distribution also suggests that cold Neptune-sized (3-5 $R_\oplus$) planets are
~4 times more common than cold Jupiter-sized (7.5-11 $R_\oplus$) ones. This is broadly
consistent with the result from microlensing surveys (see Section 3.3), pointing to the potential
existence of a large and unexplored low-mass planet population in the outer region.


3.2. The inner—outer correlation 


The planetary systems inside and outside of ~ 1 AU appear strongly correlated. Such a
strong inner-outer correlation has important implications for the formation and evolution
of the system as a whole. Below we review the observational evidence and discuss briefly
the implications of this correlation. More on the latter will be presented in Section 4.2.
We highlight two classes of inner planets, hot Jupiters and super Earths, and discuss them
separately below.


3.2.1. “Friends” with close-in Jupiters. Hot Jupiters, while usually having no detectable
planetary companions in the inner region, are frequently found to have distant massive
companions (2014; 2016; but see also Schlaufman & Winn 2016). Both of these features are
important clues to the formation and evolution of hot Jupiters, and we refer interested readers
to the review by Dawson & Johnson (2018) for in-depth discussions.

For the completeness of the discussion about the inner-outer correlation, we briefly
summarize here the key result of the “friends of hot Jupiters” search. Knutson et al. (2014)
conducted a systematic RV study of the distant companions to a sample of 51 hot
Jupiters and reported that each hot Jupiter should have on average 0.51 ± 0.10 companions
with masses between 1-13 M_J and semi-major axes between 1-20 AU. This sample was
re-analyzed in Bryan et al. (2016) with improved sensitivity calculations, and the companion
rate was revised to 0.70 ± 0.08. Given the small fraction of systems with more than one cold
companion, we take this average number to be approximately the fraction of hot Jupiter
hosts with cold Jupiter companions. This fraction barely changes after we adjust to the
parameter range used in this work (0.3-13 M_J and 1-10 AU) according to the planet
distribution function of Bryan et al. (2016). We denote this fraction as P(CJ|HJ). Additionally,
given the known fractions of Sun-like stars with hot Jupiters and cold Jupiters, P(HJ) ~ 1%
and P(CJ) ~ 10%, respectively, the inverse conditional probability is P(HJ|CJ) ~ 7%.
This is the fraction of cold Jupiter hosts with hot Jupiters. All four fractions are shown
in Figure 9.
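The inversion above is an application of Bayes' theorem, P(HJ|CJ) = P(CJ|HJ) P(HJ) / P(CJ). A minimal Python sketch using the rounded fractions quoted in the text follows; the function name `invert_conditional` is ours and purely illustrative.

```python
def invert_conditional(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Bayes' theorem: return P(A|B) from P(B|A), P(A) and P(B)."""
    if not 0.0 < p_b <= 1.0:
        raise ValueError("P(B) must lie in (0, 1]")
    return p_b_given_a * p_a / p_b


# Rounded fractions quoted in the text for Sun-like stars.
P_CJ_GIVEN_HJ = 0.70  # hot Jupiter hosts with cold Jupiter companions
P_HJ = 0.01           # stars with hot Jupiters
P_CJ = 0.10           # stars with cold Jupiters

p_hj_given_cj = invert_conditional(P_CJ_GIVEN_HJ, P_HJ, P_CJ)
print(f"P(HJ|CJ) ~ {p_hj_given_cj:.0%}")
```

Because the inputs are themselves rounded (and carry uncertainties of order 10-50%), the result should be read as an order-of-magnitude estimate rather than a precise fraction.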

Jupiter-sized planets in the inner region with known outer giant companions tend to
have higher eccentricities, suggesting that dynamical interactions may have sculpted the
architectures of these systems (e.g., Bryan et al. 2016). In particular, warm (~ 10-100 days)
Jupiters on significantly eccentric orbits (e ≳ 0.4) are much more likely to possess
relatively close (≲ 3 AU) Jovian companions than those on nearly circular (e < 0.1)
orbits (Dong et al. 2014a), and the existence of such companions is consistent with the
high-eccentricity migration scenario for forming eccentric warm Jupiters (Dong et al. 2014a,
Dawson & Chiang 2014, Petrovich & Tremaine 2016). By contrast, warm Jupiters on
nearly circular orbits show a weaker correlation with outer giant companions, and many of
these warm Jupiters are found to have nearby small planetary companions (Huang et al.
2016). These features cannot be easily reconciled with the high-eccentricity migration scenario,
suggesting that the nearly circular warm Jupiters may have formed in situ or have
undergone disk-driven migration (e.g., Raymond et al. 2008, Hallatt & Lee 2020). We
refer to Dawson & Johnson (2018) for a more comprehensive discussion of the observations
and theories related to warm Jupiters.


3.2.2. Super Earth-cold Jupiter relation. The term “super Earth” has different meanings
in different studies. Here we call a planet a super Earth if its mass (or radius) is between
the masses (or radii) of Earth and Neptune, and the correlation under discussion applies
specifically to the super Earths in the inner region. These super Earths dominate the
known inner planet population, and they can co-exist with almost all types of inner planets
except hot Jupiters (see Section 2.1). For this reason, this super Earth population is
representative of the inner planet population.

About 1/3 of the inner super Earths have outer cold Jupiter companions, as studies
have shown (Zhu & Wu 2018, Bryan et al. 2019). The RV signal on the star induced by a
super Earth is systematically smaller than the RV signal induced by a cold Jupiter. Making
use of this point, Zhu & Wu (2018) constructed a sample of 54 super Earth systems around
Sun-like hosts that received long-term RV observations, and they found that the fraction
of super Earth hosts with cold Jupiter companions is 32 ± 8%. This is ~ 3 times higher
than the frequency of cold Jupiters around field Sun-like stars. The fraction further rises
to ~ 60% for metal-rich systems (with [Fe/H] > 0.1). These results were later confirmed by
the independent study of Bryan et al. (2019). In that work the authors refit RV data sets
of 65 super Earth hosts, some of which are M dwarfs, and reported an occurrence rate of
39 ± 7% for companions with masses in the range 0.5-20 M_J and semi-major axes in the
range 1-20 AU. In this review, we take a rather conservative value of P(CJ|SE) ~ 30%,
which is also shown in Figure 9.

Figure 9

Correlations between inner planets and outer cold Jupiters. Although only two types of inner
planets are highlighted here, they are representative of the known inner planet population: The
majority of hot Jupiters do not have close neighbors (see Section 2.1.1), whereas super Earths
usually reside in systems that include other types of inner planets (see Section 2.2). The
unconditional probability shown here is the fraction of Sun-like stars with a specific type of
planet, and the conditional probability is the fraction of Sun-like stars with a specific type of
planet given that another type of planet is present in the system.

The inversion of the above conditional probability reveals an even more interesting
result. With ~ 30% of Sun-like stars hosting inner super Earths and ~ 10% of Sun-like
stars hosting cold Jupiters, one finds from Bayes' theorem that P(SE|CJ) ~ 90%,
suggesting that nearly all of the cold Jupiters should have inner small planets (Zhu & Wu 2018,
Bryan et al. 2019). Together with the fraction of cold Jupiter hosts with hot Jupiters,
P(HJ|CJ) ≈ 7%, outer giant planets almost all have inner companions.^13 We illustrate in
Figure 9 the connections between the outer giant planets and the two representative types
of inner planets.
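The arithmetic behind P(SE|CJ) ~ 90% and the "almost all" statement can be sketched as follows. This is a rough illustration with the rounded values from the text; adding the two conditional fractions assumes that hot Jupiters and super Earths do not share a system, which the text suggests but which is our simplification here.

```python
# Rounded fractions quoted in the text for Sun-like stars.
P_SE = 0.30            # stars with inner super Earths
P_CJ = 0.10            # stars with cold Jupiters
P_CJ_GIVEN_SE = 0.30   # super Earth hosts with cold Jupiters (adopted value)
P_HJ_GIVEN_CJ = 0.07   # cold Jupiter hosts with hot Jupiters

# Bayes' theorem: P(SE|CJ) = P(CJ|SE) P(SE) / P(CJ).
p_se_given_cj = P_CJ_GIVEN_SE * P_SE / P_CJ

# Assumption: hot Jupiters and super Earths are mutually exclusive, so the
# two conditional fractions can simply be added (capped at unity).
p_inner_given_cj = min(1.0, p_se_given_cj + P_HJ_GIVEN_CJ)

print(f"P(SE|CJ)    ~ {p_se_given_cj:.0%}")
print(f"P(inner|CJ) ~ {p_inner_given_cj:.0%}")

# The measured 32 +/- 8% of Zhu & Wu (2018) maps onto a wide range of
# P(SE|CJ), so the ~90% figure should be read as approximate.
for p_cj_given_se in (0.24, 0.32, 0.40):
    print(p_cj_given_se, min(1.0, p_cj_given_se * P_SE / P_CJ))
```

The last loop shows that the inverted probability saturates at unity within the 1σ range of the input, which is why the text describes the result qualitatively ("nearly all") rather than as a sharp percentage.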

The above strong correlations are also confirmed by studies that utilized the rare but
valuable long-period Kepler transiting planets (Uehara et al. 2016, Herman et al. 2019,
Masuda et al. 2020). These studies find that the fraction of long-period (P ≳ 2 years)
transiting planets with inner transiting companions is so high that it can only be explained
by a strong inner-outer correlation. They also report evidence that the dynamical hotness
of inner and outer planets may be correlated. Specifically, a dynamically hot outer
Jupiter is likely associated with a dynamically hot inner planetary system. This provides a
plausible explanation for the surprisingly large eccentricities and mutual inclinations of the
inner systems with low multiplicities (Masuda et al. 2020; see Section 2.3). It may also help
explain the reduced super Earth multiplicities around metal-rich stars (Zhu & Wu 2018;
see Section 2.5.2).

^13 It is possible that hot Jupiters were born cold and that their later evolution cleared out
the small planets originally present in the inner region. This would mean that essentially all cold
Jupiters were born with inner small planets.

The strong inner-outer correlation has implications for the frequency of planetary systems
similar to our own. On the one hand, this correlation suggests that the general solar
system-like architecture, with the inner region containing small planets and the outer
region containing giant planets, is probably common among other planetary systems. On the
other hand, planetary systems with properties very similar to ours, namely systems with
both outer Jupiter-like (~ M_J at a few AU) and inner Earth-like (~ M⊕ within 1 AU) planets,
may be rare (≲ 1%, Zhu & Wu 2018). A possible explanation could be that our Jupiter
formed very early and hence prevented the growth of inner embryos into super Earths
(Izidoro et al. 2015). This early Jupiter formation scenario also explains
the isotope measurements on solar system iron meteorites (Kruijer et al. 2017), although
the question remains why the majority of cold Jupiters in other systems do have inner super
Earths. We defer to Section 4.2 for further discussions of theoretical implications.

Observationally, the strong inner-outer correlation implies interesting synergies between
space-based transit missions and astrometric missions or ground-based long-term RV surveys.
Indeed, at least two of the TESS transiting planets have been found around stars
with known RV cold Jupiters (Huang et al. 2018, Teske et al. 2020). Future combined TESS
and Gaia planet catalogs should yield hundreds of similar systems that can enable detailed
studies of the system architecture, as has been demonstrated in the pi Mensae system
(Xuan & Wyatt 2020, Damasso et al. 2020, De Rosa et al. 2020).


3.3. Mass-ratio function from microlensing 


Gravitational microlensing probes a largely uncharted planet discovery space of cold planets
(Gould & Loeb 1992), where > 99% of the planetary mass of the
solar system resides. Ground-based microlensing surveys are sensitive to planets down to
Earth masses (e.g., Bennett & Rhie 1996, Dong et al. 2006), and a space-based survey will
be capable of discovering all solar system planet analogs except Mercury
(see also Figure 1). With the increasing number of discoveries, microlensing searches
continue to unveil the distribution of planets in this under-explored parameter space
and offer insights into planet formation outside the water snow line. We refer interested
readers to Gaudi (2012) for an overview of the microlensing technique and its application
in exoplanet discoveries (see also Mao 2012). Below we focus on the important progress
made since the review of Gaudi (2012).

Figure 10

Similar to Figure 1, but showing the planet-to-star mass ratio vs. the semi-major axis in units
of the water snow line. The snow line is at 2.7 AU for a 1 M⊙ star and scales linearly with the host
mass (Kennedy & Kenyon 2008). The location of the snow line is indicated with a black dashed
line (note that this is only for illustrative purposes, as the snow line should be determined in the
protoplanetary disk), and the possible turnover in the mass ratio distribution found from
microlensing, 0.55-1.7 × 10⁻⁴ (Suzuki et al. 2016, Jung et al. 2019), is marked with the gray
region. Six solar system planets are shown, with Mercury and Mars too low in mass ratio to
appear on this plot.

Several recent studies found evidence for a possible turnover in the planet-to-star mass
ratio function for planets beyond the water snow line (see Figure 10). In the early era, a
key microlensing finding was that cold Neptunes (with planet-to-star mass ratio q ~ 10⁻⁴)
are a factor of a few more common than cold Jupiters (with q ~ 10⁻³, Gould et al. 2006b),
and Sumi et al. (2010) found that the distribution of the mass ratio q could be described
by a power law dN/dlog₁₀ q ∝ q^ν, where ν = −0.7 ± 0.2. Using the planet sample from
the MOA-II microlensing survey, Suzuki et al. (2016) found that a single power law of the
mass ratio function does not extend to very low mass ratios. Specifically, these authors
reported a break in the mass ratio function at q ~ 10⁻⁴, corresponding to the mass of
Neptune for a typical host star mass of 0.5 M⊙. Based on a total sample of 30 planets that
combines the MOA-II and previous statistical samples (Gould et al. 2010, Cassan et al.
2012), Suzuki et al. (2016) reported a broken power-law mass ratio function with a break at
q_brk = 1.7 × 10⁻⁴. The power-law indexes above and below the break are ν = −0.93 ± 0.13
and 0.6^{+0.5}_{−0.4}, respectively. The normalization is such that n_p = 0.79 for planets with mass
ratio q > 5 × 10⁻⁵ and projected separation s, in units of the Einstein radius, in the range 0.3-5.
For typical microlenses with 0.5 M⊙, these numbers correspond to planetary masses
M_p > 8 M⊕ and orbital separations between 1-15 AU. The reported planet frequency
is compatible with those from other detection techniques (i.e., RV and direct imaging)
following a simple joint planet distribution function (Clanton & Gaudi 2014, 2016).
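To make the single versus broken power-law descriptions concrete, the following Python sketch evaluates the relative planet frequency per decade of mass ratio. It is only an illustration: the function is left unnormalized, the index below the break is taken as 0.6 (an assumed central value that carries a large uncertainty), and the function names are ours.

```python
Q_BREAK = 1.7e-4      # break mass ratio reported by Suzuki et al. (2016)
SLOPE_ABOVE = -0.93   # power-law index for q > Q_BREAK
SLOPE_BELOW = 0.6     # assumed central value of the index for q < Q_BREAK


def relative_frequency(q: float) -> float:
    """Broken power law dN/dlog q, relative to its value at the break."""
    if q <= 0:
        raise ValueError("mass ratio must be positive")
    slope = SLOPE_ABOVE if q >= Q_BREAK else SLOPE_BELOW
    return (q / Q_BREAK) ** slope


def single_power_law_ratio(q1: float, q2: float, nu: float = -0.7) -> float:
    """Ratio of dN/dlog q at q1 and q2 for a single power law of index nu."""
    return (q1 / q2) ** nu


# Cold Neptunes (q ~ 1e-4) vs. cold Jupiters (q ~ 1e-3), single power law
# with the Sumi et al. (2010) index.
print(single_power_law_ratio(1e-4, 1e-3))

# The same comparison with the broken power law.
print(relative_frequency(1e-4) / relative_frequency(1e-3))

# Below the break the frequency per decade declines toward lower q.
print(relative_frequency(1e-5))
```

Both descriptions give cold Neptunes as a factor of a few more common than cold Jupiters; they differ mainly in what they imply for q well below 10⁻⁴, where the sample is still small.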


Further studies by Udalski et al. (2018), who studied an ensemble of seven (as compared
to four in Suzuki et al. 2016) planets with q < 10⁻⁴, and Jung et al. (2019), who analyzed
a sample of 15 planets with q < 3 × 10⁻⁴, investigated the possible turnover in the mass
ratio function. Adopting a power-law form of the detection efficiency, Jung et al. (2019)
modeled the intrinsic mass-ratio distribution with a broken power law and revised the
break to q_brk ≃ 5.5 × 10⁻⁵, which is a factor of three below the value found by Suzuki et al.
(2016), but their low-mass planet sample was too small to distinguish a pile-up at that mass
ratio from a broken power law. Nevertheless, a break or pile-up in the planet-to-star mass
ratio function could have important theoretical implications (e.g., Pascucci et al. 2018, Wu
2019), and further probing the distribution of sub-Neptune microlensing planets will be a
research focus in the near future. Observations from high-cadence and nearly continuous
microlensing surveys such as KMTNet (Kim et al. 2016) are pushing toward detecting more
planets at low mass ratios (e.g., q = 1.8 × 10⁻⁵, q ≈ 1.4 × 10⁻⁵, and q ≈ 1.1 × 10⁻⁵),
so a large enough sample is
expected to be available soon to improve the determination of the mass ratio function at
the low end (see Zang et al. 2021 and discussions therein).

Another interesting feature of the mass ratio function of Suzuki et al. (2016) is its
apparent smoothness between Neptune and Jupiter masses. Intriguingly, the derived radius
distribution of cold planets from the single transit events in Kepler also appears similarly
continuous between Neptune and Saturn (Herman et al. 2019). These results are surprising
in view of the standard core accretion model, which builds on the solar
system and predicts a deficit of planets at such intermediate masses/radii (Ida & Lin 2004a,
Mordasini et al. 2009). This tension may suggest that giant planet formation involves
physical processes that have been overlooked in the standard models (Suzuki et al. 2018).
Alternatively, it could be due to the limited sample sizes of cold intermediate-mass planets
(the Suzuki et al. 2016 sample contains nine detections in the range 10⁻⁴ < q < 5 × 10⁻⁴, and
Herman et al. 2019 has four in their intermediate radius bin of 0.67-1.00 R_J). Therefore,
increasing the sample size of cold intermediate-mass planets will clarify the degree of tension
between observation and theory. Furthermore, physical mass (rather than mass ratio)
determinations of a large sample of microlensing planets through measurements of the
microlensing parallax or the lens flux (e.g., Dong et al. 2009) are needed to enable a more
direct comparison with theories. This will be possible for essentially all microlensing planets
detected to date at first light of adaptive optics on 30 m class telescopes (e.g., Skidmore et al.
2015) or for a significant fraction of planet hosts in a space-based microlensing survey such
as the Roman microlensing survey (Penny et al. 2019).


3.4. Free-floating planets 


The prevalence of eccentric and/or inclined planetary orbits suggests likely histories of violent
dynamical interactions in planetary systems, such as planet-planet scatterings, which
naturally eject a significant fraction of the planets from the system and form unbound
planets with no hosts (e.g., 2008). The distributions of free-floating planets bear important
signatures of not only the initial configurations of the planetary systems at birth, but also
their subsequent dynamical evolution.

While it is possible to directly image young sub-stellar objects down to a few Jupiter
masses (see, e.g., Zapatero Osorio et al. 2000), gravitational microlensing is the only known
method for probing lower-mass objects, which are believed to dominate the dynamically
ejected free-floating planet (FFP) population. Low-mass objects produce relatively short-timescale
microlensing light curves, as the Einstein radius crossing time t_E ∝ √M. For typical stellar-mass
microlenses, the timescale is ~ 20 days, whereas for planetary-mass objects it is < 1 day. The
detection of such short and rare events thus demands wide-field high-cadence surveys, which
have only become available in the past decade.

Mróz et al. (2017) analysed a sample of 2,617 microlensing events from the OGLE-IV
survey and concluded that the frequency of Jupiter-mass free-floating (or wide-orbit)
planets should be no more than 0.25 planets per main-sequence star at the 95% confidence
level. This result is broadly compatible with the inferred occurrence rate of bound giant
planets from RV surveys (e.g., 2008), microlensing searches,
or direct imaging (e.g., 2016), and contradicts a previous claim that
free-floating Jupiter-mass planets are more abundant than stars (Sumi et al. 2011). The
sample of Mróz et al. (2017) also includes six short events with timescales in the range
0.1 days < t_E < 0.4 days. Assuming the microlensing nature of these events and given
the low detection efficiency at such ultrashort timescales, their sample suggests that there
may be up to a few free-floating planets in the Earth-mass to super-Earth-mass range per
main-sequence star. The results of Mróz et al. (2017) about the absence of free-floating
Jupiter-mass planets and the potential existence of free-floating Earth-mass and
super-Earth-mass planets are generally consistent with theoretical expectations (e.g., Ida et al.
2016).
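The t_E ∝ √M scaling quoted above can be turned into a quick estimate of how short planetary-mass events are. The sketch below anchors the scaling at t_E = 20 days for a stellar lens; the reference lens mass of 0.5 M⊙ is our assumption, and the relative parallax and proper motion are implicitly held fixed, so the numbers are indicative only.

```python
import math

M_EARTH_IN_MSUN = 3.003e-6    # Earth mass in solar masses
M_JUPITER_IN_MSUN = 9.546e-4  # Jupiter mass in solar masses

T_E_REF_DAYS = 20.0   # typical stellar-lens timescale quoted in the text
M_REF_MSUN = 0.5      # assumed mass of a "typical" stellar lens


def einstein_timescale_days(mass_msun: float) -> float:
    """Scale t_E as sqrt(M) from the reference stellar lens."""
    if mass_msun <= 0:
        raise ValueError("lens mass must be positive")
    return T_E_REF_DAYS * math.sqrt(mass_msun / M_REF_MSUN)


for name, mass in [("Jupiter", M_JUPITER_IN_MSUN), ("Earth", M_EARTH_IN_MSUN)]:
    print(f"{name}-mass lens: t_E ~ {einstein_timescale_days(mass):.2f} d")
```

Under these assumptions a Jupiter-mass lens gives a timescale somewhat below one day and an Earth-mass lens gives roughly an hour, which illustrates why high-cadence surveys are required.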

The existence of such ultrashort-timescale events was soon confirmed thanks to the
coordinated observations of multiple microlensing survey telescopes around the globe.
The first convincing example of a microlensing event with such a timescale,
t_E = 0.32 days, was then reported, and subsequent dedicated searches led to the discovery
of a few more similar events (Ryu et al. 2020). These events all show
strong finite-source effects that arise from the lenses transiting distant giant sources, yielding
the immediate measurement of the angular Einstein radius θ_E. The lens mass scales as

$$M({\rm bulge}) = \frac{\theta_{\rm E}^2}{\kappa\,\pi_{\rm rel}} = 250\,M_\oplus \left(\frac{\theta_{\rm E}}{10\,\mu{\rm as}}\right)^{2} \left(\frac{\pi_{\rm rel}}{16\,\mu{\rm as}}\right)^{-1}, \qquad (22)$$

with the normalization of the lens-source relative parallax (π_rel) chosen such that the lens
and source are both in the Galactic bulge and separated by about 1 kpc. Here the constant
κ ≃ 8.14 mas M⊙⁻¹. For a lens in the Galactic disk (π_rel ≈ 125 μas), the mass scales as

$$M({\rm disk}) = \frac{\theta_{\rm E}^2}{\kappa\,\pi_{\rm rel}} = 32\,M_\oplus \left(\frac{\theta_{\rm E}}{10\,\mu{\rm as}}\right)^{2} \left(\frac{\pi_{\rm rel}}{125\,\mu{\rm as}}\right)^{-1}. \qquad (23)$$

Ryu et al. (2020) argue that θ_E is a better discriminator than t_E for
selecting FFPs. In fact, from a small number of events with finite-source effects, there is
a possible gap between ~ 10 μas and ~ 30 μas in the θ_E distribution, and this “Einstein
desert” may separate brown dwarfs from free-floating super Earths (and terrestrial planets)
in the disk (Ryu et al. 2020). We list in Table 1 the relevant parameters and inferred
masses of the FFP candidate events with θ_E < 10 μas. The preliminary analyses by Mróz
et al. and Ryu et al. (2020) suggest that low-mass unbound (or wide-orbit) planets
may be more common than stars in the Galaxy. Future space-based microlensing surveys
can assemble a large sample for quantitative assessments of the FFP population (e.g.,
Johnson et al. 2020), and a satellite augmented with microlensing parallax measurements
can directly measure the masses and distances of such FFP events (e.g., Gould et al.
and references therein).
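The mass estimates in Equations 22 and 23 follow from M = θ_E² / (κ π_rel). A short Python sketch of this conversion is given below, using the κ and π_rel values quoted in the text; the solar-to-Earth mass conversion factor and the function name are ours.

```python
KAPPA_MAS_PER_MSUN = 8.14      # kappa in mas per solar mass, as quoted in the text
M_SUN_IN_M_EARTH = 332_946.0   # solar mass expressed in Earth masses

PI_REL_BULGE_UAS = 16.0    # relative parallax adopted for a bulge lens
PI_REL_DISK_UAS = 125.0    # relative parallax adopted for a disk lens


def lens_mass_mearth(theta_e_uas: float, pi_rel_uas: float) -> float:
    """Lens mass M = theta_E^2 / (kappa * pi_rel), returned in Earth masses."""
    if theta_e_uas <= 0 or pi_rel_uas <= 0:
        raise ValueError("theta_E and pi_rel must be positive")
    theta_e_mas = theta_e_uas * 1e-3   # micro-arcsec -> milli-arcsec
    pi_rel_mas = pi_rel_uas * 1e-3
    mass_msun = theta_e_mas**2 / (KAPPA_MAS_PER_MSUN * pi_rel_mas)
    return mass_msun * M_SUN_IN_M_EARTH


# theta_E values (in micro-arcseconds) of two events from Table 1.
for event, theta_e in [("OGLE-2016-BLG-1928", 0.84), ("OGLE-2016-BLG-1540", 9.2)]:
    disk = lens_mass_mearth(theta_e, PI_REL_DISK_UAS)
    bulge = lens_mass_mearth(theta_e, PI_REL_BULGE_UAS)
    print(f"{event}: M(disk) ~ {disk:.2g} M_Earth, M(bulge) ~ {bulge:.2g} M_Earth")
```

For these two events the sketch gives values close to the disk and bulge masses tabulated in Table 1; small differences for other events may reflect rounding of θ_E or of the prefactors.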

Table 1 Published microlensing free-floating planet candidates (θ_E < 10 μas), sorted
by the inferred lens mass (see Equation 22 and Equation 23 for typical estimates in
the bulge and disk, respectively).

Event name            θ_E (μas)   t_E (d)   M(disk)    M(bulge)
OGLE-2016-BLG-1928    0.84        0.029     0.2 M⊕     1.8 M⊕
OGLE-2012-BLG-1323    2.4         0.16      1.8 M⊕     14 M⊕
OGLE-2019-BLG-0551    4.4         0.38      6.1 M⊕     48 M⊕
KMT-2019-BLG-2073     4.8         0.27      7.6 M⊕     59 M⊕
KMT-2017-BLG-2820     5.9         0.29      11 M⊕      87 M⊕
OGLE-2016-BLG-1540    9.2         0.32      28 M⊕      217 M⊕

While the events listed in Table 1 are promising candidates for FFPs, it is also plausible
that these objects are actually on such wide orbits that no microlensing signatures from
their hosts were detected. Light curve analyses can exclude the existence of any massive
companions (i.e., hosts) out to a few Einstein radii, corresponding to ~ 15-20 AU away
(e.g., Kim et al. 2020). In other words, these FFP candidates could
well be planets on Uranus-like or Neptune-like orbits (e.g., Poleski et al. 2014). Future
high-resolution imaging observations that can resolve the hosts of wide-separation planets
will be able to tell whether these objects are truly free-floating or loosely bound to some
unidentified stellar hosts (Han et al. 2005, Ryu et al. 2020).


4. THEORETICAL IMPLICATIONS 


The observed distribution of planets and the architecture of planetary systems, as reviewed
in Sections 2 and 3, are the consequence of ~10-100 Myr of formation and later ~Gyr of
evolution. In this section, we discuss the constraints from these observations on theoretical
models. A comprehensive overview of the formation and evolution theories is beyond the
scope of the current review. Instead, we focus on the key physical processes that lead to
observational signatures. To reduce the complexity, we restrict ourselves to planetary systems
around Sun-like hosts.


4.1. A brief overview of theories 


The generally accepted picture of planet formation can be traced back to the nebular
model originally proposed by Immanuel Kant and Pierre Laplace in the 1700s. Modern
theorists generally believe that planets formed out of the gas and dust in the
protoplanetary disk. Small solid particles first accumulate to form asteroid-sized (~1-100 km)
planetesimals, and the collisions between planetesimals eventually lead to the formation of
planet-sized objects (Chamberlin 1916, Safronov 1972). See Woolfson (1993) for a historical
overview of the planet formation theories.

In the core accretion theory that explains the solar system formation (e.g.,
Pollack et al. 1996), the primary building blocks for planet formation are planetesimals. The
growth of planetesimals is first divergent (i.e., the run-away phase) and then convergent
(i.e., the oligarchic phase), until they have cleared nearly all planetesimals in their feeding
zone (~ 5 R_H). These so-called protoplanets (or embryos) are now ≳ 1000 km in size and
around Mars-mass (e.g., Ida & Makino 1993, Kokubo & Ida 1998). The further growth of
the protoplanets involves planetesimal accretion as well as dynamical interactions between
protoplanets. At a few AU separation, the growth of protoplanets is sufficient to allow
the formation of giant planets (Mizuno 1980, Pollack et al. 1996). In the classical picture,
giant planet formation has three phases: core formation, hydrostatic gas accretion,
and run-away gas accretion (Pollack et al. 1996). The hydrostatic gas accretion phase
starts when the embryo reaches a critical core mass (~ 10 M⊕; Mizuno 1980, Stevenson
1982). This phase can take up to ~ 10 Myr and is the most time-consuming step in this
classical core accretion model. The run-away gas accretion is triggered once the envelope
and the core have comparable masses, and it efficiently pushes the total mass into the giant
planet regime (≳ 100 M⊕). In the inner region, embryos grow slowly and never reach the
critical core mass before the gaseous disk is depleted. The later evolution involves collisions
between these embryos in the gas-free environment. This so-called giant impact phase lasts
~ 100 Myr and eventually forms the terrestrial planets (e.g., 2001).

A new paradigm that has attracted much attention in recent years is pebble accretion. In
the astrophysical context, pebbles are dust particles that are weakly coupled to the gas and
thus drift in the disk. The inclusion of pebbles in the formation picture provides a plausible
scenario for the formation of planetesimals via the streaming instability (Youdin & Goodman
2005, Johansen et al. 2007, Chiang & Youdin 2010). Unlike planetesimals that are decoupled
from the gas, pebbles “feel” the aerodynamic drag from the gas and drift inward toward
the star (Nakagawa et al. 1986). This means that the “food” supply to a protoplanet
is not limited to the local material. Additionally, the cross-section for protoplanets (or
planetesimals) to accrete pebbles is larger than the cross-section for the same objects to
accrete planetesimals (Lambrechts & Johansen 2012). These two
factors together make pebble accretion more efficient in building up cores of protoplanets.
When the protoplanet becomes massive enough, it starts to carve a gap in the pebble disk,
and the resulting pressure bump outside of its orbit stops the inward drifting pebbles.
The corresponding mass of the protoplanet is called the pebble isolation mass


$$M_{\rm iso} \approx 10 \left(\frac{h/r}{0.04}\right)^{3} M_\oplus , \qquad (24)$$


where h/r is the disk aspect ratio at location r and the prefactor is determined numerically
and depends on disk properties (Lambrechts et al. 2014). The above relation assumes a
solar-mass host star. For other stellar masses, the pebble isolation mass scales linearly
with the mass of the host star (Liu et al. 2019). Once a protoplanet reaches the pebble
isolation mass, it effectively cuts off the pebble flux and “starves” the protoplanetary core
and all embryos interior to its orbit. With pebble accretion halted, the critical core mass
required to trigger rapid gas accretion is reduced and thus giant planets can form more
efficiently (Lambrechts et al. 2014). Furthermore, pebbles can easily vaporize in the hot
envelopes before they can reach the cores (Brouwers et al. 2018). The enriched envelopes
also speed up the formation of giant planets (see also Stevenson 1982
for a similar mechanism in the planetesimal accretion scenario).
We refer interested readers to the reviews by Johansen & Lambrechts (2017) and Ormel
(2017) for more details about the pebble accretion model.
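The two scalings stated above, a cubic dependence of the pebble isolation mass on the disk aspect ratio and a linear dependence on the host mass, can be illustrated without committing to a particular prefactor. The sketch below only computes ratios of isolation masses; the function name and the example aspect ratios are hypothetical.

```python
def isolation_mass_ratio(h_over_r: float, h_over_r_ref: float,
                         m_star: float = 1.0, m_star_ref: float = 1.0) -> float:
    """Ratio M_iso / M_iso,ref for M_iso proportional to (h/r)^3 * M_star."""
    if min(h_over_r, h_over_r_ref, m_star, m_star_ref) <= 0:
        raise ValueError("all inputs must be positive")
    return (h_over_r / h_over_r_ref) ** 3 * (m_star / m_star_ref)


# Hypothetical comparison: a disk location with h/r = 0.06 versus one with 0.03.
print(isolation_mass_ratio(0.06, 0.03))

# Same aspect ratio, but a 0.5 solar-mass host instead of a solar-mass one.
print(isolation_mass_ratio(0.05, 0.05, m_star=0.5))
```

Because h/r typically increases outward in a flared disk, the steep cubic dependence means that cores can grow considerably larger before isolating themselves at larger separations.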
Planets may undergo disk-driven migration while accreting pebbles, planetesimals,
and/or gas (e.g., Kley & Nelson 2012 and references therein). Migration can substantially
change the architecture of the planetary system, such as locking planets into mean
motion resonances (2002). However, such features
are not prominent in Kepler systems (see Section 2.4.2). This may suggest that most Kepler
planets have not undergone significant disk-driven migration (see also Section 4.2).
Alternatively, the Kepler planets may have never entered into resonances during the migration
(e.g., Goldreich & Schlichting 2014), or the long-term dynamical evolution after the disk
dispersal has effectively removed most of these features (e.g., 2017, 2019).


4.2. Constraints from observations


Given the substantial uncertainties in theories and in some parts of observations, we think
that it is premature to provide detailed and quantitative comparisons between theories and
observations (but see attempts by, e.g., Hansen & Murray 2013, Izidoro et al. 2017, Mulders
et al. 2019). We therefore focus on some selected constraints
that are considered robust and discuss their implications for the formation
of Kepler-like planets:


• Prevalence and multiplicity. Inner super Earth-like planets are known to exist
around ~ 30% of Sun-like stars, and they typically reside in multi-planet systems
(Section 2.2). Additionally, they preferentially have outer cold Jupiter companions
(Section 3.2), suggesting that the two types of planets do not inhibit, but perhaps
promote, the formation of each other. Unlike giant planets, super Earths show a
much weaker dependence on the host metallicity (Section 2.5.2).

• Composition. As inferred from population-level studies of the radius valley
(Section 2.1.4) as well as mass and radius measurements of individual Kepler planets (e.g.,
Wu & Lithwick 2013, Hadden & Lithwick 2017), some inner small planets
likely have Earth-like (i.e., rocky and ice-poor) cores, and these cores acquired
gaseous envelopes that weigh up to a few percent of the total mass while the disk was
still present.


The prevalence and the early formation of super Earth-like planets suggest that the
planet formation process is more efficient than what had been expected from solar system
formation models (e.g., Ida & Lin 2004a, Mordasini et al. 2009). This alone may not be an
issue for the pebble accretion scenario (see Section 4.1). In fact, pebble accretion can be so
efficient that preventing super Earth-mass planets from undergoing run-away gas accretion
poses another challenge, a possible solution to which could be a delayed formation near
the end of the disk phase (e.g., Lee et al. 2014). For the planetesimal accretion scenario,
a very massive disk is typically required to form super Earths efficiently and early (e.g.,
2020). The rocky composition suggests that the cores formed in an
ice-poor environment, likely inside the water ice line. In order for embryos or protoplanets
from outside of the ice line not to largely contaminate the inner region, disk-driven
migration is probably suppressed.

The strong correlation between inner super Earths and outer cold Jupiters is somewhat
challenging for both accretion scenarios under typical protoplanetary disk conditions. The
planetesimal accretion scenario usually requires relatively efficient disk-driven migration
to explain the presence of abundant super Earths around metal-poor hosts, but the same
migration efficiency turns out to be overkill in reproducing the strong inner-outer correlation
(Schlecker et al.; see also Ida & Lin 2010). For the pebble accretion scenario, because
the solid supply to protoplanets is not limited locally, there is potentially a direct
competition between different embryos. Furthermore, once the core of the outer giant planet
reaches the pebble isolation mass, the further growth of the inner planets is significantly
limited, and the giant planet also acts as a barrier to the inward migrating embryos
from outside of its orbit. Therefore, the pebble accretion scenario typically expects an
anti-correlation between inner and outer planets (Morbidelli et al. 2015, Izidoro et al. 2015,
Lambrechts et al. 2019). Alternatively, the cores of both inner super Earths and outer
cold Jupiters could be formed at such large distances (tens of AU) that enough material
is available to the inner cores (Bitsch et al. 2015, 2019), although it is unclear whether
such an approach can quantitatively reproduce the observed correlation and form rocky-core
planets. The difficulty in reproducing the inner-outer correlation may suggest that
many protoplanetary disks start heavier than what has been typically assumed. Indeed, if
Kepler planets are formed in situ based on the local material (i.e., not the inward-drifting
pebbles), the required surface density is much higher than the minimum-mass solar nebula
model (Weidenschilling 1977, Hayashi 1981) and almost reaches the gravitational instability
limit (Chiang & Laughlin 2013, Schlichting 2014).

^14 Pebble accretion is efficient in growing planet embryos into larger bodies. However, pebble
accretion is also lossy, as ≳ 90% of the planet-forming material falls onto the host star rather than
being accreted onto the growing planets (Liu & Ormel 2018, Lin et al. 2018).

5. SUMMARY AND DISCUSSION 


The discovery of thousands of exoplanets from the combination of multiple detection
techniques has substantially advanced our understanding of the distribution of planets and the
architecture of planetary systems. This review aims to update our knowledge of exoplanet
statistics since the Winn & Fabrycky (2015) review. In Section 2, we described the
distribution and properties of planets in the inner region, based mostly on discoveries from the
Kepler mission. In Section 3, we reviewed the recent progress on the cold planet population,
with an emphasis on their connections to the close-in companions. Section 4 briefly
described the theoretical models and the key constraints from observations. We summarize
the key results below:


SUMMARY POINTS 


1. In the inner region (S 1AU), about 30% of Sun-like stars host planets with 
masses/radii down to Earth mass/radii, and each planetary system on average has 
about three such planets. These suggest that the planet formation process is more 
efficient than what had been expected from solar system formation models. 

2. Planetary systems with more planets appear dynamically colder, with smaller orbital
eccentricities, mutual inclinations, and orbital spacings. For systems with few
planets in the inner region, planets can have ∼0.3 orbital eccentricities and ≳ 10°
mutual inclinations. These findings support the idea that dynamical evolution has played a
significant role in reshaping the system architecture.

3. There exists a “radius valley” at Rp ∼ 2 R⊕ and P < 30 days. The valley was
predicted by the photoevaporation theory, although alternative explanations have
also been proposed. Population-level analyses of the radius valley suggest that these 
planets were probably born with rocky cores and gaseous atmospheres up to a few 
percent of the core masses. 

4. Cold Neptune-like planets are a few times more abundant than cold Jupiter-like 
ones in the outer region. The inner (< 1 AU) and the outer (~1-10 AU) planetary 
systems appear strongly correlated such that inner small planets preferentially have 
cold Jupiter-like companions and that outer cold Jupiters almost always have inner 
planetary companions. 
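
The two numbers in the first summary point combine into an average number of inner planets per Sun-like star. A minimal sketch of that arithmetic follows; the function name and the treatment of both inputs as exact point values (ignoring their uncertainties) are illustrative assumptions.

```python
def mean_planets_per_star(frac_stars_with_planets, planets_per_system):
    """Average planets per star = (fraction of stars hosting planets) x (mean multiplicity)."""
    return frac_stars_with_planets * planets_per_system


# Values quoted in summary point 1: ~30% of Sun-like stars, ~3 planets per system.
eta = mean_planets_per_star(0.30, 3.0)
print(f"about {eta:.1f} inner planets per Sun-like star")
```

The distinction matters when comparing surveys: the fraction of stars with planets and the mean number of planets per star differ by the average multiplicity.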


With the ongoing and upcoming missions that have better capabilities and/or open
up new observational channels, our understanding of exoplanets and planetary systems
will continue to improve. We outline below several promising directions that may see
substantial advancement in the near future:


FUTURE PROSPECTS 


1. Exoplanet atmosphere and mass-radius relation: Space-based all-sky transit surveys
like the TESS mission (Ricker et al. 2015) have been finding many bright
targets, enabling detailed characterizations of more close-in planets (e.g.,
Armstrong et al. 2020). An improved mass-radius relation and better
atmospheric characterizations will help to understand the composition and potentially
the past evolution of the planet (see the recent review by Madhusudhan 2019).

2. Planetary system architecture: The joint coverage of different surveys will potentially
open up a larger parameter space (see Figure 1) and reveal more interesting
features about the planetary system architecture. The Gaia mission alone is expected
to detect at least thousands of giant planets around nearby stars
(et al. 2014), and its synergy with other surveys/missions will also open up new
channels into the architecture study (e.g., De Rosa et al. 2020). RV follow-ups of
systems detected by other methods could also play increasingly indispensable roles
in this aspect, in particular with the extreme RV instruments with capability down
to ∼0.3 m s⁻¹ now coming online (e.g., Fischer et al. 2016).

3. Planets across the HR diagram: Rapid advances of large-scale spectroscopic surveys
and the Gaia satellite are continuing to revolutionize stellar astrophysics and
Galactic astronomy. Further synergies of large samples of exoplanets with detailed
stellar chemical compositions, kinematics and/or age measurements are expected
to place planet formation in the rich context of stellar populations and evolution.

4. Planets around young stars: While the present review focused on the planetary
systems around ≳ Gyr old stars, the demographics of planets around young (≲
100 Myr) stars are a crucial link toward a more direct comparison with both planet
formation theories and ALMA observations of protoplanetary disks (see the recent
2020 review). The detection and characterization of more planets
around young stars (e.g., PDS 70b, c, 2019, and AU Microscopii b,
Plavchan et al. 2020) will be valuable.
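
To put the ∼0.3 m/s RV capability mentioned in item 2 in context, the sketch below evaluates the standard Keplerian RV semi-amplitude, K = (2πG/P)^(1/3) m_p sin i / [(M_* + m_p)^(2/3) sqrt(1 - e^2)]. The constants, the function name rv_semi_amplitude, and the choice of an Earth-mass planet on a one-year orbit around a solar-mass star are assumptions made here for illustration.

```python
import math

# Physical constants in SI units.
G = 6.674e-11       # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30    # solar mass [kg]
M_EARTH = 5.972e24  # Earth mass [kg]
YEAR = 3.156e7      # one year [s]


def rv_semi_amplitude(m_planet, period, m_star=M_SUN, ecc=0.0, sin_i=1.0):
    """Stellar RV semi-amplitude K [m/s] induced by a single planet.

    m_planet and m_star are in kg, period in seconds.
    """
    return ((2.0 * math.pi * G / period) ** (1.0 / 3.0)
            * m_planet * sin_i
            / (m_star + m_planet) ** (2.0 / 3.0)
            / math.sqrt(1.0 - ecc ** 2))


# Hypothetical test case: an Earth analog around a Sun-like star.
k_earth = rv_semi_amplitude(M_EARTH, YEAR)
print(f"Earth analog: K = {k_earth:.3f} m/s")
```

With these inputs an Earth analog gives K of roughly 0.09 m/s, below the ∼0.3 m/s level, while more massive or shorter-period planets produce proportionally larger signals.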